PROBES AND DECODER OLIGONUCLEOTIDES

This application claims the benefit of U.S.S.N.s 60/227,948, filed August 25, 2000, and 60/228,854, filed
August 29, 2001, both of which are expressly incorporated herein by reference.

FIELD OF THE INVENTION 

The present invention is directed to methods and compositions for the use of adapter sequences on
arrays in a variety of nucleic acid reactions, including synthesis reactions, amplification reactions, and
genotyping reactions.

BACKGROUND OF THE INVENTION 

The detection of specific nucleic acids is an important tool for diagnostic medicine and molecular 
biology research. Gene probe assays currently play roles in identifying infectious organisms such as
bacteria and viruses, in probing the expression of normal and mutant genes and identifying mutant 
genes such as oncogenes, in typing tissue for compatibility preceding tissue transplantation, in 
matching tissue or blood samples for forensic medicine, and for exploring homology among genes 
from different species. 

Ideally, a gene probe assay should be sensitive, specific and easily automatable (for a review, see 
Nickerson, Current Opinion in Biotechnology 4:48-51 (1993)). The requirement for sensitivity (i.e. low 
detection limits) has been greatly alleviated by the development of the polymerase chain reaction 
(PCR) and other amplification technologies which allow researchers to amplify exponentially a specific 
nucleic acid sequence before analysis (for a review, see Abramson et al., Current Opinion in
Biotechnology, 4:41-47 (1993)). 
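
The exponential character of target amplification mentioned above can be illustrated with a minimal sketch. It assumes an idealized reaction in which a fixed fraction of templates is duplicated in every cycle; the function name and the `efficiency` parameter are illustrative assumptions and do not come from the cited references.

```python
def pcr_copies(initial_copies: int, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized copy number after a given number of amplification cycles.

    efficiency is the assumed fraction of templates duplicated per cycle
    (1.0 means perfect doubling). Real reactions plateau; this model does not.
    """
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    # Each cycle multiplies the template count by (1 + efficiency).
    return initial_copies * (1.0 + efficiency) ** cycles
```

Under these assumptions, a single starting molecule carried through 10 cycles of perfect doubling yields 2 to the 10th power copies, which is why a small number of target molecules can be brought above a detection limit.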

Specificity, in contrast, remains a problem in many currently available gene probe assays. The extent 
of molecular complementarity between probe and target defines the specificity of the interaction. 
Variations in the concentrations of probes, of targets and of salts in the hybridization medium, in the
reaction temperature, and in the length of the probe may alter or influence the specificity of the
probe/target interaction.

It may be possible under some circumstances to distinguish targets with perfect complementarity from 
targets with mismatches, although this is generally very difficult using traditional technology, since 
small variations in the reaction conditions will alter the hybridization. New experimental techniques for
mismatch detection with standard probes include DNA ligation assays where single point mismatches 
prevent ligation and probe digestion assays in which mismatches create sites for probe cleavage. 

Recent focus has been on the analysis of the relationship between genetic variation and phenotype by 
making use of polymorphic DNA markers. Previous work utilized short tandem repeats (STRs) as
polymorphic positional markers; however, recent focus is on the use of single nucleotide 
polymorphisms (SNPs), which occur at an average frequency of more than 1 per kilobase in human 
genomic DNA. Some SNPs, particularly those in and around coding sequences, are likely to be the 
direct cause of therapeutically relevant phenotypic variants and/or disease predisposition. There are a 
number of well known polymorphisms that cause clinically important phenotypes; for example, the
apoE2/3/4 variants are associated with different relative risk of Alzheimer's and other diseases (see
Corder et al., Science 261 (1993)). Multiplex PCR amplification of SNP loci with subsequent
hybridization to oligonucleotide arrays has been shown to be an accurate and reliable method of 
simultaneously genotyping at least hundreds of SNPs; see Wang et al., Science, 280:1077 (1998); 
see also Schafer et al., Nature Biotechnology 16:33-39 (1998). The compositions of the present
invention may easily be substituted for the arrays of the prior art. 

There are a variety of particular techniques that are used to detect sequence, including mutations and 
SNPs. These include, but are not limited to, ligation based assays, cleavage based assays (mismatch
and invasive cleavage such as Invader™), single base extension methods (see WO 92/15712, EP 0
371 437 B1, EP 0 317 074 B1; Pastinen et al., Genome Res. 7:606-614 (1997); Syvanen, Clinica
Chimica Acta 226:225-236 (1994); and WO 91/13075), and competitive probe analysis (e.g. 
competitive sequencing by hybridization; see below). 

Oligonucleotide ligation amplification ("OLA", which is referred to as the ligation chain reaction (LCR)
when two-stranded reactions or nested reactions are done) involves the ligation of two smaller probes
into a single long probe, using the target sequence as the template. See generally U.S. Patent Nos. 
5,185,243, 5,679,524 and 5,573,907; EP 0 320 308 B1; EP 0 336 731 B1; EP 0 439 182 B1; WO 
90/01069; WO 89/12696; WO 97/31256 and WO 89/09835, all of which are incorporated by reference. 
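
The template-dependent ligation described above can be sketched as follows. This is a simplified model, not the method of the cited patents: it assumes that ligation occurs only when both probes are perfectly complementary to the target and lie immediately adjacent, whereas in practice discrimination is strongest for mismatches at the ligation junction. All sequences and function names are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def ligation_product(template: str, upstream_probe: str, downstream_probe: str):
    """Return the single long probe if the two probes can be ligated, else None.

    Probes and template are written 5' to 3'. The joined probe must be the
    exact reverse complement of a contiguous stretch of the template, which
    implies that both probes anneal perfectly and abut one another.
    """
    joined = upstream_probe + downstream_probe
    if reverse_complement(joined) in template:
        return joined
    return None
```

With the hypothetical template `TTACGGATCCAA`, the probes `GGAT` and `CCGT` are joined into `GGATCCGT`, while replacing the first probe with `GGAA` (a single mismatch at the junction) yields no product.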

Invasive cleavage technology is based on structure-specific nucleases that cleave nucleic acids in a 
site-specific manner. Two probes are used: an "invader" probe and a "signalling" probe, that 
adjacently hybridize to a target sequence with a non-complementary overlap. The enzyme cleaves at 
the overlap due to its recognition of the "tail", and releases the "tail" with a label. This can then be
detected. The Invader™ technology is described in U.S. Patent Nos. 5,846,717; 5,614,402;
5,719,028; 5,541,311; and 5,843,669, all of which are hereby incorporated by reference.



An additional technique utilizes sequencing by hybridization. For example, sequencing by 
hybridization has been described (Drmanac et al., Genomics 4:114 (1989); Koster et al., Nature
Biotechnology 14:1123 (1996); U.S. Patent Nos. 5,525,464; 5,202,231 and 5,695,940, among others,
all of which are hereby expressly incorporated by reference in their entirety). 

Sensitivity, i.e. detection limits, remains a significant obstacle in nucleic acid detection systems, and a
variety of techniques have been developed to address this issue. Briefly, these techniques can be
classified as either target amplification or signal amplification. Target amplification involves the 
amplification (i.e. replication) of the target sequence to be detected, resulting in a significant increase 
in the number of target molecules. Target amplification strategies include the polymerase chain 
reaction (PCR), strand displacement amplification (SDA), and nucleic acid sequence based 
amplification (NASBA).

Alternatively, rather than amplify the target, alternate techniques use the target as a template to 
replicate a signalling probe, allowing a small number of target molecules to result in a large number of 
signalling probes, that then can be detected. Signal amplification strategies include the ligase chain 
reaction (LCR), cycling probe technology (CPT), invasive cleavage techniques such as Invader™
technology, Q-Beta replicase (QβR) technology, and the use of "amplification probes" such as
"branched DNA" that result in multiple label probes binding to a single target sequence.

The polymerase chain reaction (PCR) is widely used and described, and involves the use of primer
extension combined with thermal cycling to amplify a target sequence; see U.S. Patent Nos. 4,683,195
and 4,683,202, and PCR Essential Data, J. W. Wiley & Sons, Ed. C.R. Newton, 1995, all of which are
incorporated by reference. In addition, there are a number of variations of PCR which also find use in
the invention, including "quantitative competitive PCR" or "QC-PCR", "arbitrarily primed PCR" or
"AP-PCR", "immuno-PCR", "Alu-PCR", "PCR single strand conformational polymorphism" or
"PCR-SSCP", allelic PCR (see Newton et al., Nucl. Acid Res. 17:2503 (1989)), "reverse transcriptase PCR" or
"RT-PCR", "biotin capture PCR", "vectorette PCR", "panhandle PCR", and "PCR select cDNA
subtraction", among others. 

Strand displacement amplification (SDA) is generally described in Walker et al., in Molecular Methods 
for Virus Detection, Academic Press, Inc., 1995, and U.S. Patent Nos. 5,455,166 and 5,130,238, all of
which are hereby incorporated by reference. 

Nucleic acid sequence based amplification (NASBA) is generally described in U.S. Patent No. 
5,409,818 and "Profiting from Gene-based Diagnostics", CTB International Publishing Inc., N.J., 1996,
both of which are incorporated by reference.



Cycling probe technology (CPT) is a nucleic acid detection system based on signal or probe 
amplification rather than target amplification, such as is done in polymerase chain reactions (PCR). 
Cycling probe technology relies on a molar excess of labeled probe which contains a scissile linkage
of RNA. Upon hybridization of the probe to the target, the resulting hybrid contains a portion of 
RNA:DNA. This area of RNA:DNA duplex is recognized by RNAseH and the RNA is excised, resulting 
in cleavage of the probe. The probe now consists of two smaller sequences which may be released, 
thus leaving the target intact for repeated rounds of the reaction. The unreacted probe is removed and 
the label is then detected. CPT is generally described in U.S. Patent Nos. 5,011,769, 5,403,711,
5,660,988, and 4,876,187, and PCT published applications WO 95/05480, WO 95/1416, and WO
95/00667, all of which are specifically incorporated herein by reference.

The oligonucleotide ligation assay (OLA) involves the ligation of at least two smaller probes into a single
long probe, using the target sequence as the template for the ligase. See generally U.S. Patent Nos.
5,185,243, 5,679,524 and 5,573,907; EP 0 320 308 B1; EP 0 336 731 B1; EP 0 439 182 B1; WO
90/01069; WO 89/12696; and WO 89/09835, all of which are incorporated by reference. 

Invader™ technology is based on structure-specific polymerases that cleave nucleic acids in a
site-specific manner. Two probes are used: an "invader" probe and a "signalling" probe, that adjacently
hybridize to a target sequence with overlap. For mismatch discrimination, the invader technology 
relies on complementarity at the overlap position where cleavage occurs. The enzyme cleaves at the 
overlap, and releases the "tail" which may or may not be labeled. This can then be detected. The 
Invader™ technology is described in U.S. Patent Nos. 5,846,717; 5,614,402; 5,719,028; 5,541,311;
and 5,843,669, all of which are hereby incorporated by reference.

"Branched DNA" signal amplification relies on the synthesis of branched nucleic acids, containing a 
multiplicity of nucleic acid "arms" that function to increase the amount of label that can be put onto one 
probe. This technology is generally described in U.S. Patent Nos. 5,681,702, 5,597,909, 5,545,730,
5,594,117, 5,591,584, 5,571,670, 5,580,731, 5,624,802, 5,635,352, 5,594,118,
5,359,100, 5,124,246 and 5,681,697, all of which are hereby incorporated by reference. 

Similarly, dendrimers of nucleic acids serve to vastly increase the amount of label that can be added
to a single molecule, using a similar idea but different compositions. This technology is as described
in U.S. Patent No. 5,175,270 and Nilsen et al., J. Theor. Biol. 187:273 (1997), both of which are
incorporated herein by reference. 

U.S.S.N.s 09/189,543; 08/944,850; 09/033,462; 09/287,573; 09/151,877; 09/187,289 and 09/256,943; 
and PCT applications US98/09163 and US99/14387; US98/21193; US99/04473 and US98/05025, all
of which are expressly incorporated by reference, describe novel compositions utilizing substrates with
microsphere arrays, which allow for novel detection methods of nucleic acid hybridization. 



The use of adapter-type sequences that allow the use of universal arrays has been described in 
limited contexts; see for example Chee et al., Nucl. Acid Res. 19:3301 (1991); Shoemaker et al.,
Nature Genetics 14:450 (1996); U.S. Patent Nos. 5,494,810, 5,830,711, 6,027,889, 6,054,564, and
6,268,148; and EP 0 799 897 A1; WO 97/31256, all of which are expressly incorporated by reference.

Accordingly, it is an object of the present invention to provide methods for detecting nucleic acid 
reactions, and other target analytes, on arrays using adapter sequences.

SUMMARY OF THE INVENTION 

In accordance with the above objects, the invention also provides a method of detecting a target 
nucleic acid. The method comprises contacting the target nucleic acid with an adapter sequence such
that the target nucleic acid is joined to the adapter sequence to form a modified target nucleic acid. In 
addition, the method comprises contacting the modified target nucleic acid with an array comprising a 
substrate with a surface comprising discrete sites and a population of microspheres comprising at 
least a first subpopulation comprising a first capture probe, such that the first capture probe and the 
modified target nucleic acid form a complex, wherein the microspheres are distributed on the surface,
and detecting the presence of the target nucleic acid. In addition the method comprises adding at
least one decoding binding ligand to the array such that the identity of the target nucleic acid is
determined. Preferably the adapter nucleic acids include a sequence as set forth in Table I,
Table II, Table III or Table IV.

In addition the invention provides a method of making an array. The method comprises forming a 
surface comprising individual sites on a substrate, distributing microspheres on the surface such that 
the individual sites contain microspheres, wherein the microspheres comprise at least a first and a 
second subpopulation each comprising a capture probe, wherein the capture probe is complementary 
to an adapter sequence, the adapter sequence joined to a target nucleic acid, and an identifier binding
ligand that will bind at least one decoder binding ligand such that the identification of the target nucleic 
acid is elucidated. Preferably the adapter nucleic acids include a sequence as set forth in Table I, 
Table II, Table III or Table IV. 

In addition the invention provides a kit comprising at least one nucleic acid selected from the group
consisting of the sequences set forth in Table I, Table II, Table III or Table IV. In one embodiment the
invention provides a kit that includes a nucleic acid that includes a sequence as set forth in Table I, 
Table II, Table III or Table IV and at least a first universal priming sequence.

In addition the invention includes an array composition comprising a first population of microspheres
comprising first and second subpopulations, wherein the first subpopulation includes a first nucleic 
acid selected from the sequences set forth in Table I, Table II, Table III or Table IV and the second 
subpopulation includes a second sequence selected from the sequences set forth in Table I, Table II, 
Table III or Table IV.

In addition the invention includes an array composition comprising a first sequence at a known location 
on a substrate, wherein the first sequence is selected from the sequences set forth in Table I, Table II, 
Table III or Table IV. 

In addition the invention includes a method for making an array. The method includes distributing a 
population of microspheres on a substrate, wherein the population includes first and second
subpopulations, wherein the first subpopulation includes a first sequence selected from the group 
consisting of the sequences set forth in Table I, Table II, Table III or Table IV and the second 
subpopulation includes a second sequence selected from the group consisting of the sequences set
forth in Table I, Table II, Table III or Table IV. 

In addition the invention includes a method of immobilizing a target nucleic acid. The method includes
hybridizing a first adapter probe with a first target nucleic acid, wherein the first adapter probe
comprises a first domain that is complementary to the first target nucleic acid and a second domain,
comprising a first sequence selected from the sequences set forth in Table I, Table II, Table III or 
Table IV to form a first hybridization complex. In addition the method includes contacting the first 
hybridization complex with a first capture probe immobilized on a first substrate, wherein the first 
capture probe is substantially complementary to the second domain of the first adapter probe. 

In addition the invention includes a method of decoding an array composition comprising providing an 
array composition that includes a substrate with a surface comprising discrete sites and a population 
of microspheres comprising at least a first and a second subpopulation, wherein each subpopulation 
comprises a bioactive agent. The microspheres are distributed on the surface. The method further 
includes adding a plurality of decoding binding ligands to the array composition to identify the location
of at least a plurality of the bioactive agents wherein at least a first decoder binding ligand comprises 
a sequence selected from the group consisting of the sequences of Table I, Table II, Table III or Table 
IV. 

In addition the invention provides a method of detecting a target nucleic acid sequence, said method
comprising attaching a first adapter nucleic acid to a first target nucleic acid sequence to form a modified first target nucleic acid
sequence, wherein the first adapter nucleic acid includes a sequence selected from the sequences set 
forth in Table I, Table II, Table III or Table IV. The method further includes contacting the modified 
first target nucleic acid sequence with an array comprising a substrate with a patterned surface
comprising discrete sites and a population of microspheres comprising at least a first subpopulation
comprising a first capture probe, such that the first capture probe and the modified first target nucleic 
acid sequence form a hybridization complex; wherein the microspheres are distributed on the surface 
and detecting the presence of the modified first target nucleic acid sequence. 



DETAILED DESCRIPTION OF THE FIGURES

Figure 1 depicts a method of selecting oligonucleotide sequences.

Figure 2 depicts a scheme for selection of probes and decoder oligonucleotides. 

Figure 3 demonstrates hybridization intensity comparison of immobilized beads using non-purified 
oligonucleotides with HPLC purified oligonucleotides. 

Figure 4 depicts different oligonucleotide sequences immobilized onto silica beads at various salt 
concentrations. Average intensity indicates hybridization intensity of beads in a BeadArray.

Figure 5 depicts immobilization of oligonucleotides in increasing salt concentrations. 

DETAILED DESCRIPTION OF THE INVENTION 

This invention is directed to the use of adapter sequences, and optionally capture extender probes, 
that allow the use of "universal" arrays. That is, a "universal" array is an array with a set of capture 
probes that will hybridize to adapter sequences, for use in any number of different reactions, including 
the binding of nucleic acid reactions and other target analytes comprising a nucleic acid adapter 
sequence that can hybridize to the array. In this way, a manufacturer of arrays can make one type of 
array that may be used in a variety of applications, thus reducing the manufacturing costs associated 
with the array. In addition, in the case of bead arrays, the decoding steps as outlined below can be 
simplified, as one set of decoding probes can be made. 

In general, the use of adapter sequences can be described as follows for nucleic acid reactions. An 
adapter sequence can be added exogenously to a target nucleic acid sequence using any number of 
different techniques, including, but not limited to, amplification reactions as described in U.S.S.N. 
09/425,633, filed October 22, 1999; 09/513,362, filed February 25, 2000; 09/517,945, filed March 3, 
2000; 09/535,854, filed March 27, 2000; 09/553,993, filed April 20, 2000; 09/556,463, filed April 21, 
2000; 60/135,051, filed May 20, 1999; 60/135,053, filed May 20, 1999; 60/135,123, filed May 20, 1999; 
60/130,089, filed April 20, 1999; 60/160,917, filed October 22, 1999; 60/160,927, filed October 22,
1999; 60/161,148, filed October 22, 1999; and 60/244,119, filed October 26, 2000, all of which are
hereby incorporated by reference. In addition, the adapter can be added to an extension probe. The 
adapter sequence can then be used to target to its complementary capture probe on the surface. 
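
The targeting of an adapter to its complementary capture probe, as described above, can be sketched as a simple lookup. This is a minimal illustration under stated assumptions: the adapter sequences below are hypothetical placeholders (they are not taken from Tables I to IV), hybridization is modeled as exact reverse-complement matching, and the function names are invented for the example.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def build_universal_array(adapters: dict) -> dict:
    """Map each capture probe (the adapter's reverse complement) to the adapter name."""
    return {reverse_complement(seq): name for name, seq in adapters.items()}


def capture(array: dict, modified_target: str) -> list:
    """List the adapters whose capture probes could hybridize to the modified target."""
    return [
        name
        for probe, name in array.items()
        # A capture probe binds where its reverse complement (the adapter) occurs.
        if reverse_complement(probe) in modified_target
    ]


# Hypothetical adapters; the same array serves any target that carries one of them.
HYPOTHETICAL_ADAPTERS = {"adapter_1": "AAGGCTCT", "adapter_2": "CCTATGAG"}
```

Because the capture probes depend only on the adapters and not on the target sequences, the same array (and, for bead arrays, the same decoding probes) can be reused across applications, which is the sense in which the array is "universal".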

Alternatively, the adapter sequences can be added to other target analytes, to generate unique and
reproducible arrays of target analytes in a similar manner. By adding the nucleic acid to the target 
analyte (for example to an antibody in an immunoassay), the target analytes may then be arrayed. 

Accordingly, the present invention provides methods for the detection of target analytes, particularly 
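nucleic acid target sequences, in a sample. As will be appreciated by those in the art, the sample
solution may comprise any number of things, including, but not limited to, bodily fluids (including, but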
solution may comprise any number of things, including, but not limited to, bodily fluids (including, but 
not limited to, blood, urine, serum, lymph, saliva, anal and vaginal secretions, perspiration and semen, 
of virtually any organism, with mammalian samples being preferred and human samples being 
particularly preferred); environmental samples (including, but not limited to, air, agricultural, water and 
soil samples); biological warfare agent samples; research samples; purified samples, such as purified
genomic DNA, RNA, proteins, etc.; raw samples (bacteria, virus, genomic DNA, etc.). As will be
appreciated by those in the art, virtually any experimental manipulation may have been done on the 
sample. 

The present invention provides methods for the detection of target analytes, particularly nucleic acid
target sequences, in a sample. By "target analyte" or "analyte" or grammatical equivalents herein is 
meant any molecule, compound or particle to be detected. As outlined below, target analytes 
preferably bind to binding ligands, as is more fully described below. As will be appreciated by those in 
the art, a large number of analytes may be detected using the present methods; basically, any target
analyte for which a binding ligand, described below, may be made may be detected using the methods
of the invention. 

Suitable analytes include organic and inorganic molecules, including biomolecules. In a preferred 
embodiment, the analyte may be an environmental pollutant (including pesticides, insecticides, toxins,
etc.); a chemical (including solvents, polymers, organic materials, etc.); therapeutic molecules
(including therapeutic and abused drugs, antibiotics, etc.); biomolecules (including hormones, 
cytokines, proteins, lipids, carbohydrates, cellular membrane antigens and receptors (neural, 
hormonal, nutrient, and cell surface receptors) or their ligands, etc.); whole cells (including prokaryotic
(such as pathogenic bacteria) and eukaryotic cells, including mammalian tumor cells); viruses
(including retroviruses, herpesviruses, adenoviruses, lentiviruses, etc.); and spores; etc. Particularly
preferred analytes are environmental pollutants; nucleic acids; proteins (including enzymes, 
antibodies, antigens, growth factors, cytokines, etc.); therapeutic and abused drugs; cells; and viruses.

In a preferred embodiment, the target analyte is a protein. As will be appreciated by those in the art,
there are a large number of possible proteinaceous target analytes that may be detected using the
present invention. By "proteins" or grammatical equivalents herein is meant proteins, oligopeptides 
and peptides, derivatives and analogs, including proteins containing non-naturally occurring amino 
acids and amino acid analogs, and peptidomimetic structures. The side chains may be in either the 
(R) or the (S) configuration. In a preferred embodiment, the amino acids are in the (S) or
L-configuration. As discussed below, when the protein is used as a binding ligand, it may be desirable
to utilize protein analogs to retard degradation by sample contaminants. 

Suitable protein target analytes include, but are not limited to, (1) immunoglobulins, particularly IgEs,
IgGs and IgMs, and particularly therapeutically or diagnostically relevant antibodies, including but not
limited to, for example, antibodies to human albumin, apolipoproteins (including apolipoprotein E),
human chorionic gonadotropin, cortisol, α-fetoprotein, thyroxin, thyroid stimulating hormone (TSH),
antithrombin, antibodies to pharmaceuticals (including antiepileptic drugs (phenytoin, primidone,
carbamazepine, ethosuximide, valproic acid, and phenobarbital), cardioactive drugs (digoxin, lidocaine,
procainamide, and disopyramide), bronchodilators (theophylline), antibiotics (chloramphenicol,
sulfonamides), antidepressants, immunosuppressants, abused drugs (amphetamine,
methamphetamine, cannabinoids, cocaine and opiates)) and antibodies to any number of viruses
(including orthomyxoviruses (e.g. influenza virus), paramyxoviruses (e.g. respiratory syncytial virus,
mumps virus, measles virus), adenoviruses, rhinoviruses, coronaviruses, reoviruses, togaviruses (e.g.
rubella virus), parvoviruses, poxviruses (e.g. variola virus, vaccinia virus), enteroviruses (e.g.
poliovirus, coxsackievirus), hepatitis viruses (including A, B and C), herpesviruses (e.g. Herpes
simplex virus, varicella-zoster virus, cytomegalovirus, Epstein-Barr virus), rotaviruses, Norwalk
viruses, hantavirus, arenavirus, rhabdovirus (e.g. rabies virus), retroviruses (including HIV, HTLV-I and
-II), papovaviruses (e.g. papillomavirus), polyomaviruses, and picornaviruses, and the like), and
bacteria (including a wide variety of pathogenic and non-pathogenic prokaryotes of interest including
Bacillus; Vibrio, e.g. V. cholerae; Escherichia, e.g. Enterotoxigenic E. coli; Shigella, e.g. S.
dysenteriae; Salmonella, e.g. S. typhi; Mycobacterium, e.g. M. tuberculosis, M. leprae; Clostridium, e.g.
C. botulinum, C. tetani, C. difficile, C. perfringens; Corynebacterium, e.g. C. diphtheriae; Streptococcus,
e.g. S. pyogenes, S. pneumoniae; Staphylococcus, e.g. S. aureus; Haemophilus, e.g. H. influenzae;
Neisseria, e.g. N. meningitidis, N. gonorrhoeae; Yersinia, e.g. G. lamblia, Y. pestis; Pseudomonas, e.g.
P. aeruginosa, P. putida; Chlamydia, e.g. C. trachomatis; Bordetella, e.g. B. pertussis; Treponema,
e.g. T. pallidum; and the like); (2) enzymes (and other proteins), including but not limited to, enzymes
used as indicators of or treatment for heart disease, including creatine kinase, lactate dehydrogenase,
aspartate aminotransferase, troponin T, myoglobin, fibrinogen, cholesterol, triglycerides, thrombin,
tissue plasminogen activator (tPA); pancreatic disease indicators including amylase, lipase,
chymotrypsin and trypsin; liver function enzymes and proteins including cholinesterase, bilirubin, and
alkaline phosphatase; aldolase, prostatic acid phosphatase, terminal deoxynucleotidyl transferase, and
bacterial and viral enzymes such as HIV protease; (3) hormones and cytokines (many of which serve
as ligands for cellular receptors) such as erythropoietin (EPO), thrombopoietin (TPO), the interleukins
(including IL-1 through IL-17), insulin, insulin-like growth factors (including IGF-1 and -2), epidermal
growth factor (EGF), transforming growth factors (including TGF-α and TGF-β), human growth
hormone, transferrin, low density lipoprotein, high density lipoprotein,
leptin, VEGF, PDGF, ciliary neurotrophic factor, prolactin, adrenocorticotropic hormone (ACTH),
calcitonin, human chorionic gonadotropin, cortisol, estradiol, follicle stimulating hormone (FSH),
thyroid-stimulating hormone (TSH), luteinizing hormone (LH), progesterone, testosterone; and (4) other
proteins (including α-fetoprotein, carcinoembryonic antigen (CEA)).

In addition, any of the biomolecules for which antibodies may be detected may be detected directly as 
well; that is, detection of virus or bacterial cells, therapeutic and abused drugs, etc., may be done
directly. 

Suitable target analytes include carbohydrates, including but not limited to, markers for breast cancer 
(CA15-3, CA 549, CA 27.29), mucin-like carcinoma associated antigen (MCA), ovarian cancer
(CA125), pancreatic cancer (DE-PAN-2), and colorectal and pancreatic cancer (CA 19, CA 50,
CA242). 

In a preferred embodiment, the target analyte (and various adapters and other probes of the
invention) comprises nucleic acids. By "nucleic acid" or "oligonucleotide" or grammatical equivalents
herein is meant at least two nucleotides covalently linked together. A nucleic acid of the present
invention will generally contain phosphodiester bonds, although in some cases, as outlined below,
nucleic acid analogs are included that may have alternate backbones, comprising, for example,
phosphoramide (Beaucage et al., Tetrahedron 49(10):1925 (1993) and references therein; Letsinger,
J. Org. Chem. 35:3800 (1970); Sprinzl et al., Eur. J. Biochem. 81:579 (1977); Letsinger et al., Nucl.
Acids Res. 14:3487 (1986); Sawai et al., Chem. Lett. 805 (1984); Letsinger et al., J. Am. Chem. Soc.
110:4470 (1988); and Pauwels et al., Chemica Scripta 26:141 (1986)), phosphorothioate (Mag et al.,
Nucleic Acids Res. 19:1437 (1991); and U.S. Patent No. 5,644,048), phosphorodithioate (Brill et al., J.
Am. Chem. Soc. 111:2321 (1989)), O-methylphosphoroamidite linkages (see Eckstein, Oligonucleotides
and Analogues: A Practical Approach, Oxford University Press), and peptide nucleic acid backbones
and linkages (see Egholm, J. Am. Chem. Soc. 114:1895 (1992); Meier et al., Chem. Int. Ed. Engl.
31:1008 (1992); Nielsen, Nature, 365:566 (1993); Carlsson et al., Nature 380:207 (1996), all of which
are incorporated by reference). Other analog nucleic acids include those with positive backbones
(Dempcy et al., Proc. Natl. Acad. Sci. USA 92:6097 (1995)); non-ionic backbones (U.S. Patent Nos.
5,386,023, 5,637,684, 5,602,240, 5,216,141 and 4,469,863; Kiedrowski et al., Angew. Chem. Intl. Ed.
English 30:423 (1991); Letsinger et al., J. Am. Chem. Soc. 110:4470 (1988); Letsinger et al.,
Nucleoside & Nucleotide 13:1597 (1994); Chapters 2 and 3, ACS Symposium Series 580,
"Carbohydrate Modifications in Antisense Research", Ed. Y.S. Sanghvi and P. Dan Cook; Mesmaeker
et al., Bioorganic & Medicinal Chem. Lett. 4:395 (1994); Jeffs et al., J. Biomolecular NMR 34:17
(1994); Tetrahedron Lett. 37:743 (1996)) and non-ribose backbones, including those described in U.S.
Patent Nos. 5,235,033 and 5,034,506, and Chapters 6 and 7, ACS Symposium Series 580,
"Carbohydrate Modifications in Antisense Research", Ed. Y.S. Sanghvi and P. Dan Cook. Nucleic
acids containing one or more carbocyclic sugars are also included within the definition of nucleic acids
(see Jenkins et al., Chem. Soc. Rev. (1995) pp. 169-176). Several nucleic acid analogs are described
in Rawls, C & E News June 2, 1997 page 35. All of these references are hereby expressly
incorporated by reference. These modifications of the ribose-phosphate backbone may be done to
facilitate the addition of labels, alter the hybridization properties of the nucleic acids, or to increase the 
stability and half-life of such molecules in physiological environments. 

As will be appreciated by those in the art, all of these nucleic acid analogs may find use in the present
invention. In addition, mixtures of naturally occurring nucleic acids and analogs can be made.
Alternatively, mixtures of different nucleic acid analogs, and mixtures of naturally occurring nucleic
acids and analogs may be made.

Particularly preferred are peptide nucleic acids (PNA), which include peptide nucleic acid analogs.
These backbones are substantially non-ionic under neutral conditions, in contrast to the highly charged
phosphodiester backbone of naturally occurring nucleic acids. This results in two advantages. First,
the PNA backbone exhibits improved hybridization kinetics. PNAs have larger changes in the melting
temperature (Tm) for mismatched versus perfectly matched basepairs. DNA and RNA typically exhibit
a 2-4°C drop in Tm for an internal mismatch. With the non-ionic PNA backbone, the drop is closer to
7-9°C. This allows for better detection of mismatches. Similarly, due to their non-ionic nature,
hybridization of the bases attached to these backbones is relatively insensitive to salt concentration.

The nucleic acids may be single stranded or double stranded, as specified, or contain portions of both
double stranded or single stranded sequence. The nucleic acid may be DNA, both genomic and
cDNA, RNA or a hybrid, where the nucleic acid contains any combination of deoxyribo- and
ribo-nucleotides, and any combination of bases, including uracil, adenine, thymine, cytosine, guanine,
inosine, xanthine, hypoxanthine, isocytosine, isoguanine, etc. A preferred embodiment utilizes
isocytosine and isoguanine in nucleic acids designed to be complementary to other probes, rather than
target sequences, as this reduces non-specific hybridization, as is generally described in U.S. Patent
No. 5,681,702. As used herein, the term "nucleoside" includes nucleotides as well as nucleoside and
nucleotide analogs, and modified nucleosides such as amino modified nucleosides. In addition,
"nucleoside" includes non-naturally occurring analog structures. Thus for example the individual units
of a peptide nucleic acid, each containing a base, are referred to herein as a nucleoside.

In general, probes of the present invention (including adapter sequences and capture probes,
described below) are designed to be complementary to a target sequence (either the target sequence
of the sample or to other probe sequences, for example adapter sequences) such that hybridization of
the target and the probes of the present invention occurs. This complementarity need not be perfect;
there may be any number of base pair mismatches that will interfere with hybridization between the
target sequence and the single stranded nucleic acids of the present invention. However, if the
number of mutations is so great that no hybridization can occur under even the least stringent of
hybridization conditions, the sequence is not a complementary target sequence. Thus, by
"substantially complementary" herein is meant that the probes are sufficiently complementary to the
target sequences to hybridize under the selected reaction conditions.
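
As a rough illustration of the definition above, the following sketch counts the positions at which a probe fails to base-pair with a target and applies a mismatch cutoff. The function names and the `max_mismatches` parameter are hypothetical; the cutoff is only a stand-in for "the selected reaction conditions", which in practice depend on temperature, salt concentration and probe length rather than on a simple count.

```python
# Illustrative sketch only: a mismatch count is a crude proxy for hybridization.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def count_mismatches(probe, target):
    """Count positions where the probe does not pair with the target.

    Both sequences are written 5'->3' and must have the same length.
    """
    if len(probe) != len(target):
        raise ValueError("probe and target must have the same length")
    perfect_probe = reverse_complement(target)
    return sum(1 for p, q in zip(probe.upper(), perfect_probe) if p != q)


def is_substantially_complementary(probe, target, max_mismatches=2):
    """Hypothetical cutoff: tolerate up to max_mismatches mismatched positions."""
    return count_mismatches(probe, target) <= max_mismatches
```

A stricter assay would correspond to a smaller `max_mismatches`; setting it to zero requires perfect complementarity.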

When nucleic acids are to be detected, they are referred to herein as "target nucleic acids" or "target
sequences". The term "target sequence" or "target nucleic acid" or grammatical equivalents herein
means a nucleic acid sequence on a single strand of nucleic acid. The target sequence may be a
portion of a gene, a regulatory sequence, genomic DNA, cDNA, RNA including mRNA and rRNA, or
others. As is outlined herein, the target sequence may be a target sequence from a sample, or a
derivative target such as a product of a reaction such as a detection sequence from an Invader™
reaction, a ligated probe from an OLA reaction, an extended probe from an SBE reaction, etc. It may
be any length, with the understanding that longer sequences are more specific. As will be appreciated
by those in the art, the complementary target sequence may take many forms. For example, it may be
contained within a larger nucleic acid sequence, i.e. all or part of a gene or mRNA, a restriction
fragment of a plasmid or genomic DNA, among others. As is outlined more fully below, probes are
made to hybridize to target sequences to determine the presence or absence of the target sequence in
a sample. Generally speaking, this term will be understood by those skilled in the art. The target
sequence may also be comprised of different target domains; for example, a first target domain of the
sample target sequence may hybridize to a capture probe, a second target domain may hybridize to a
portion of a label probe, etc. The target domains may be adjacent or separated as indicated. Unless
specified, the terms "first" and "second" are not meant to confer an orientation of the sequences with
respect to the 5'-3' orientation of the target sequence. For example, assuming a 5'-3' orientation of the
complementary target sequence, the first target domain may be located either 5' to the second
domain, or 3' to the second domain. In addition, as will be appreciated by those in the art, the probes
on the surface of the array (e.g. attached to the microspheres) may be attached in either orientation,
either such that they have a free 3' end or a free 5' end.

As is more fully outlined below, the target sequence may comprise a position for which sequence
information is desired, generally referred to herein as the "detection position" or "detection locus". In a
preferred embodiment, the detection position is a single nucleotide, although in some embodiments, it
may comprise a plurality of nucleotides, either contiguous with each other or separated by one or more
nucleotides. By "plurality" as used herein is meant at least two. As used herein, the base which
basepairs with a detection position base in a hybrid is termed a "readout position" or an "interrogation
position".

In some embodiments, as is outlined herein, the target sequence may not be the sample target
sequence but instead is a product of a reaction herein, sometimes referred to herein as a "secondary"
or "derivative" target sequence. Thus, for example, in SBE, the extended primer may serve as the
target sequence; similarly, in invasive cleavage variations, the cleaved detection sequence may serve
as the target sequence.

If required, the target sequence is prepared using known techniques. For example, the sample may 
be treated to lyse the cells, using known lysis buffers, electroporation, etc., with purification and/or 
amplification as needed, as will be appreciated by those in the art. 

Once prepared, the target sequence can be used in a variety of reactions for a variety of reasons. For 
example, in a preferred embodiment, genotyping reactions are done. Similarly, these reactions can 
also be used to detect the presence or absence of a target sequence. Sequencing or amplification 
reactions are also preferred. In addition, in any reaction, quantitation of the amount of a target 
sequence may be done. 

Furthermore, as outlined below for each reaction, many of these techniques may be used in a solution 
based assay, wherein the reaction is done in solution and a reaction product is bound to the array for 
subsequent detection, or in solid phase assays, where the reaction occurs on the surface and is 
detected. 

In general, the present invention provides pairs of capture probes (nucleic acids that are attached to 
addresses on arrays) and adapter sequences (sequences that are either perfectly or substantially 
complementary to the capture probe sequences) that can be used in a wide variety of ways, to 
immobilize target nucleic acids (either primary targets, such as genomic DNA, mRNA or cDNA, or 
secondary targets such as amplicons from a nucleic acid amplification or extension reaction, as 
outlined herein) to the addresses of the array. Thus, all the sequences in the Tables include their 
complements, and either sequence can be used as a capture probe (e.g. spotted onto a surface or 
attached to a microsphere of an array) or as the adapter sequence that binds to the capture probe. 

Accordingly, by "adapter sequences" or "adapters" or grammatical equivalents is meant a nucleic acid 
segment generally non-native or exogenous to a target molecule that is used to immobilize the target 
molecule to a solid support via binding to a capture probe sequence. In a preferred embodiment the 
adapter sequences and capture probes are selected from the sequences set forth in Table I, Table II, 
Table III or Table IV. 

Table I includes the sequence of the preferred 4000 sequences labeled "Decoder (5'-3')", and inherent
in this table are the complementary sequences as well. In addition, the invention includes
oligonucleotides that are complementary to those depicted in Table 1.

Table II includes the sequence of the preferred adapter/capture probe sequences and their
complementary sequence. Table 2 depicts a preferred subset of 3172 decoder oligonucleotides and
their complementary probe oligonucleotides. Accordingly, the invention provides compositions
comprising a sequence as outlined in Table 2. In addition, the invention provides a composition
comprising a complementary binding pair as outlined in Table 2.

Table 3 includes a preferred subset of 768 decoder oligonucleotides and complementary probe
sequences. In some embodiments it may be desirable to include a uniform base at a terminus of the
oligonucleotide, such as a T at the 5' end as depicted in Table 4. The inclusion of this uniform or
constant base facilitates uniform labeling of the oligonucleotides.

These sequences are used as decoder probes, capture probes or adapter sequences as outlined in 
U.S.S.N. 09/344,526 and PCT/US99/14387, and U.S.S.N.s 60/160,917 and 09/5656,463 all of which 
are expressly incorporated by reference in their entirety. 

As will be appreciated by those in the art, the length of the capture probe/adapter sequences will vary,
depending on the desired "strength" of binding and the number of different adapters desired. In a
preferred embodiment, adapter sequences range from about 5 to about 500 basepairs in length, with
from about 8 to about 100 being preferred, and from about 10 to about 50 being particularly preferred.

As will be appreciated by those in the art, it is desirable to have adapter sequences that do not have
significant homology to naturally occurring target sequences, to avoid non-specific or erroneous
binding of target sequences to the capture probes. Accordingly, preferred embodiments utilize some
method to select useful adapter sequences. In a preferred embodiment the method is outlined in
Figure 1. Briefly, random 24-mer sequences (or sequences of any desired length as outlined herein)
were assembled and subjected to certain defined screening procedures, including such steps as
requiring that the Tm of each sequence be within a pre-defined range. In addition, the GC content must
be balanced with the AT content and the self-complementarity must be minimized. In addition, GC
runs should be minimized, that is, runs of Gs or Cs should be reduced. In addition, decoder (adapter)
to decoder (adapter) complementarity should be reduced so that the adapters do not hybridize with
each other. Finally, the sequences are screened against a specified genomic database. In a
preferred embodiment the adapters comprise at least one sequence selected from the sequences in
Table I, Table II, Table III or Table IV.
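
The screening steps described above can be sketched in code. The following is a minimal illustration, not the procedure of Figure 1: every threshold (`tm_range`, `gc_range`, `max_gc_run`, the k-mer length `k`) is an assumed value, the Tm estimate uses the simple Wallace rule (2°C per A/T, 4°C per G/C) as a stand-in for whatever Tm calculation is actually applied, a "GC run" is interpreted as consecutive bases drawn from G or C, and the final genomic-database screen is omitted because it requires an external database.

```python
import random

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    return "".join(COMPLEMENT[b] for b in reversed(seq))


def gc_fraction(seq):
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq):
    """Crude Tm estimate (Wallace rule); an assumption, not the patent's method."""
    gc = seq.count("G") + seq.count("C")
    return 2 * (len(seq) - gc) + 4 * gc


def longest_gc_run(seq):
    """Length of the longest uninterrupted stretch of G and/or C bases."""
    best = run = 0
    for base in seq:
        run = run + 1 if base in "GC" else 0
        best = max(best, run)
    return best


def shares_kmer(a, b, k):
    """True if any k-mer of sequence a also occurs in sequence b."""
    kmers = {a[i:i + k] for i in range(len(a) - k + 1)}
    return any(b[i:i + k] in kmers for i in range(len(b) - k + 1))


def passes_screen(seq, accepted, tm_range=(68, 76), gc_range=(0.4, 0.6),
                  max_gc_run=3, k=6):
    """Apply the Tm, GC balance, GC run and complementarity filters."""
    if not tm_range[0] <= wallace_tm(seq) <= tm_range[1]:
        return False
    if not gc_range[0] <= gc_fraction(seq) <= gc_range[1]:
        return False
    if longest_gc_run(seq) > max_gc_run:
        return False
    rc = reverse_complement(seq)
    if shares_kmer(seq, rc, k):  # self-complementarity
        return False
    # adapter-to-adapter complementarity against sequences already accepted
    return not any(shares_kmer(rc, other, k) for other in accepted)


def select_adapters(n, length=24, seed=0, max_tries=100000):
    """Draw random candidates and keep those that pass every filter."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(max_tries):
        if len(accepted) == n:
            break
        candidate = "".join(rng.choice("ACGT") for _ in range(length))
        if passes_screen(candidate, accepted):
            accepted.append(candidate)
    return accepted
```

Because each candidate is checked against the adapters already accepted, the size of the final set depends on the order in which candidates are drawn; a real selection over thousands of sequences would likely need a more careful thermodynamic model than shared k-mers.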

In a preferred embodiment, the adapter sequences are chosen on the basis of a decoding step. As is
more fully outlined below, a decoding step is used to decode random bead arrays. In this
embodiment, a set of candidate capture probes is chosen; this may be done in a variety of ways. In a
preferred embodiment, the sequences are generated randomly, each of a sufficient length to ensure a
low probability of occurring naturally. In some embodiments, for example when the array will be used
with a particular organism's genome (e.g. the human genome, the Drosophila genome, etc.), the
sequences are compared to the genome as a first filter, for example to remove sequences that would
cross hybridize. Additionally, further filtering may be done using well-known methods, such as known
methods for selecting good PCR primers. These techniques generally include steps that remove
sequences that may have a propensity to form secondary structures or otherwise to cross-hybridize.
Additionally, sequences that have extremes of melting temperatures can be optionally discarded,
depending on the planned assay conditions.

Once a set of candidate capture probes is obtained, an array comprising the capture probes is made,
and a matching set of decoding probes comprising the adapter sequences (e.g. the complements of
the capture probes), as more fully outlined below, is made. Decoding then proceeds. Probes that do
not hybridize well, for whatever reason, will not decode well, generally due to weak signals, and are
generally discarded. Probes that cross-hybridize will also not decode well, as they will give ambiguous
or mixed decoding signals. Only probes that hybridize sufficiently strongly and specifically will decode.
Thus, by setting suitable thresholds for signal strength and signal purity, adapter sequences that
perform according to specified criteria are identified. Additionally, by setting a range on signal
strength, capture probe/adapter sequence pairs that perform similarly (but hybridize specifically) are
identified. In a preferred embodiment, decoding reactions are repeated, under a variety of conditions,
to test the robustness of the sequence pair.
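
The thresholding described above might be sketched as follows. This is an illustration under stated assumptions: signal strength is taken to be the brightest channel of a decoding readout, signal purity is taken to be that channel's share of the total signal, and the threshold values, channel names and probe names are all hypothetical.

```python
def decode_quality(intensities):
    """Return (strength, purity) for one probe's per-channel intensities.

    Strength is the brightest channel; purity is its fraction of the total.
    Both definitions are assumptions made for this sketch.
    """
    total = sum(intensities.values())
    strength = max(intensities.values())
    purity = strength / total if total > 0 else 0.0
    return strength, purity


def select_pairs(readings, min_strength=1000.0, max_strength=5000.0,
                 min_purity=0.9):
    """Keep probes whose signal is strong, pure and within a common range."""
    kept = []
    for name, intensities in readings.items():
        strength, purity = decode_quality(intensities)
        if min_strength <= strength <= max_strength and purity >= min_purity:
            kept.append(name)
    return kept


# Hypothetical readouts for three capture probe/adapter pairs.
example = {
    "pair_1": {"dye_1": 3000.0, "dye_2": 100.0},   # strong and specific
    "pair_2": {"dye_1": 200.0, "dye_2": 50.0},     # weak signal
    "pair_3": {"dye_1": 2000.0, "dye_2": 1800.0},  # mixed (cross-hybridizing)
}
print(select_pairs(example))
```

Repeating the selection on readouts taken under different conditions and keeping only the pairs that pass every time would correspond to the robustness test mentioned above.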

Once identified, the adapter sequences are added to target sequences in a variety of ways, as will be
appreciated by those in the art. In a preferred embodiment, nucleic acid amplification reactions are
done, as is generally outlined in "Detection of Nucleic Acid Amplification Reactions Using Bead Arrays"
and "Sequence Determination of Nucleic Acids using Arrays with Microspheres", both of which were
filed on October 22, 1999, (U.S.S.N.'s 60/161,148 and 09/425,633, respectively), both of which are
hereby incorporated by reference in their entirety. These may be either target amplification or signal
amplification. In general, the techniques can be described as follows. Most amplification techniques
require one or more primers hybridizing to all or part of the target sequence (e.g. that hybridize to a
target domain). The adapter sequences can be added to one or more of the primers (depending on the
configuration/orientation of the system and need) and the amplification reactions are run. Thus, for
example, PCR primers comprising at least one adapter sequence (and preferably one on each PCR
primer) may be used; one or both of the ligation probes of an OLA or LCR reaction may comprise an
adapter sequence; the sequencing primers for pyrosequencing, single-base extension, reversible
chain termination, etc., reactions may comprise an adapter sequence; either the invader probe or the
signalling probe of invasive cleavage reactions can comprise an adapter sequence; etc. Similarly, for
signal detection techniques, the probes may comprise adapter sequences, with preferred methods
utilizing removal of the unreacted probes. In addition, primers may include universal priming
sequences. That is, the adapters may additionally contain universal priming sequences for universal
amplification of products of any of the reactions described herein. Universal priming sequences are
further outlined in 09/779376, filed February 7, 2001; 09/779202, filed February 7, 2001; 09/915231,
filed July 24, 2001; 60/180810, filed February 7, 2000; 60/297609, filed June 11, 2001; and
60/311194 filed August 9, 2001, all of which are expressly incorporated herein by reference.
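
As a simple illustration of adding an adapter to a primer, the sketch below builds an adapter-tailed primer and the capture probe that would bind the adapter. The function names are hypothetical, and the 5'-to-3' ordering (optional universal priming sequence, then adapter, then target-specific sequence) is an assumption chosen for the example; as noted above, the actual arrangement depends on the configuration and orientation of the system.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    return "".join(COMPLEMENT[b] for b in reversed(seq.upper()))


def tailed_primer(adapter, target_specific, universal=""):
    """Assemble a primer 5'-[universal]-[adapter]-[target-specific]-3'.

    The ordering of the three parts is an assumption of this sketch.
    """
    return universal + adapter + target_specific


def capture_probe_for(adapter):
    """A perfectly complementary capture probe for the given adapter."""
    return reverse_complement(adapter)


# Short toy sequences, for illustration only.
primer = tailed_primer("AACGT", "GGGTTT")
probe = capture_probe_for("AACGT")
```

The strand extended from such a primer carries the adapter at its 5' end, so it can be immobilized by the matching capture probe regardless of the target-specific portion.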

In an alternative embodiment, non-nucleic acid reactions are used to add adapter sequences to the
nucleic acid targets. For example, for the direct detection of non-amplified target sequences (e.g.
genomic DNA samples, etc.) on universal arrays, non-amplification methods are required. In this
embodiment, binding partner pairs or chemical methods may be used. For example, one member of a
binding partner pair may be attached to the adapter sequence and the other member attached to the
target sequence. For example, the binding partner may be a hapten or antigen, which will bind its
binding partner. Suitable binding partner pairs include, but are not limited to: antigens (such as
proteins (including peptides)) and antibodies (including fragments thereof (FAbs, etc.)); proteins and
small molecules, including biotin/streptavidin and digoxigenin and antibodies; enzymes and substrates
or inhibitors; other protein-protein interacting pairs; receptor-ligands; and carbohydrates and their
binding partners. Nucleic acid/nucleic acid binding protein pairs are
also useful. In general, the smaller of the pair is attached to the NTP (or the probe) for incorporation
into the extension primer. Preferred binding partner pairs include, but are not limited to, biotin (or
imino-biotin) and streptavidin, digoxigenin and Abs, and Prolinx™ reagents.

In a preferred embodiment, chemical attachment methods are used. In this embodiment, chemical 
functional groups on each of the target sequences and adapter sequences are used. As is known in 
the art, this may be accomplished in a variety of ways. Preferred functional groups for attachment are 
amino groups, carboxy groups, oxo groups and thiol groups, with amino groups being particularly 
preferred. Using these functional groups, the two sequences are joined together; for example, amino 
groups on each nucleic acid may be attached, for example using linkers as are known in the art; for 
example, homo- or hetero-bifunctional linkers as are well known (see 1994 Pierce Chemical Company
catalog, technical section on cross-linkers, pages 155-200, incorporated herein by reference). 

In a preferred embodiment, aptamers are used in the system. Aptamers are nucleic acids that can be
made to bind to virtually any target analyte; see Bock et al., Nature 355:564 (1992); Famulok et al.,
Current Op. Chem. Biol. 2:230 (1998); and U.S. Patents 5,270,163, 5,475,096, 5,567,588, 5,595,877,
5,637,459, 5,683,867, 5,705,337, and related patents, hereby incorporated by reference.

In a preferred embodiment, an array comprising capture probes that hybridize to adapter sequences is 
made, as outlined herein. In one embodiment aptamers, comprising adapter sequences, can be 
added. As will be appreciated by those in the art, the aptamers may be preassociated with their 
binding partners, e.g. target analytes, prior to introduction to the array, or not. In addition, the 
association between the adapter sequences on the aptamers and the capture probes can be made 
covalent, for example through the use of reactive groups (e.g. psoralen) and appropriate activation.

In addition, the present invention is directed to the use of adapter sequences to assemble arrays 
comprising other target analytes. 

The adapter sequences may be chosen as outlined above. Preferably the adapters are selected from
the sequences set forth in Table I, Table II, Table III or Table IV. These adapter sequences can then
be added to the target analytes using a variety of techniques. In general, as described above,
non-covalent attachment using binding partner pairs may be done, or covalent attachment using chemical
moieties (including linkers).

Advantages of using adapters include but are not limited to, for example, the ability to create universal
arrays. That is, a single array is utilized with each capture probe designed to hybridize with a specific
adapter. The adapters are joined to any number of target analytes, such as nucleic acids, as is
described herein. Thus, the same array is used for vastly different target analytes. Furthermore,
hybridization of adapters with capture probes results in non-covalent attachment of the target nucleic
acid to the address of the array (e.g. a microsphere in some embodiments). As such, the target
nucleic acid/adapter hybrid is easily removed, and the microsphere/capture probe can be re-used. In
addition, the construction of kits is greatly facilitated by the use of adapters. For example, arrays or
microspheres can be prepared that comprise the capture probe; the adapters can be packaged along
with the microspheres for attachment to any target analyte of interest. Thus, one need only attach the
adapter to the target analyte and disperse on the array for the construction of an array of target
analytes.

Accordingly the present invention provides kits comprising adapters. Preferably the kits include at
least 1 nucleic acid sequence as set forth in Table 1. More preferably the kits include at least 10-25
nucleic acids, with at least 50 nucleic acids more preferred. Even more preferable are kits that include
at least 100 nucleic acids with more than 1000 even more preferred and more than 2000 even more
preferred.

It should also be noted that the sequences defined herein can also be used in "sandwich" assay
formats, wherein a capture extender probe comprising a first domain that will hybridize to the capture
probe and a second domain that has a target specific domain is used. The capture extender probe
hybridizes both to the target sequence and the capture probe, thereby immobilizing the target
sequence on the array.

Once the adapter sequences are associated with the target analyte, including target nucleic acids, the 
compositions are added to an array comprising addresses comprising capture probes. In one 
embodiment a plurality of hybrid adapter sequence/target analytes are pooled prior to addition to an 
array. All of the methods and compositions herein are drawn to compositions and methods for
detecting the presence of target analytes, particularly nucleic acids, using adapter arrays. 

Accordingly, the present invention provides array compositions comprising at least a first substrate
with a surface comprising individual sites. The present system finds particular utility in array formats,
i.e. wherein there is a matrix of capture probes (herein generally referred to as "pads", "addresses" or
"micro-locations"). By "array" or "biochip" herein is meant a plurality of nucleic acids in an array format;
the size of the array will depend on the composition and end use of the array. Nucleic acid arrays are
known in the art, and can be classified in a number of ways; both ordered arrays (e.g. the ability to
resolve chemistries at discrete sites), and random arrays are included. Ordered arrays include, but
are not limited to, those made using photolithography techniques (Affymetrix GeneChip™), spotting
techniques (Synteni and others), printing techniques (Hewlett Packard and Rosetta), three dimensional
"gel pad" arrays, etc. In one embodiment the ordered arrays include arrays that contain nucleic acids
at known locations. That is, the adapters or capture probes described herein are immobilized at known
locations on a substrate. By "known" locations is meant a site that is known or has been known.

In addition, adapters find use in "liquid arrays". By "liquid arrays" is meant an array in solution for
analysis, for example, by flow cytometry.

A preferred embodiment utilizes microspheres on a variety of substrates including fiber optic bundles,
as are outlined in PCTs US98/21193, PCT US99/14387 and PCT US98/05025; WO98/50782; and
U.S.S.N.s 09/287,573, 09/151,877, 09/256,943, 09/316,154, 60/119,323, 09/315,584; all of which are
expressly incorporated by reference. While much of the discussion below is directed to the use of
microsphere arrays on fiber optic bundles, any array format of nucleic acids on solid supports may be
utilized.

Arrays containing from about 2 different bioactive agents (e.g. different beads, when beads are used)
to many millions can be made, with very large arrays being possible. Generally, the array will
comprise from two to as many as a billion or more, depending on the size of the beads and the
substrate, as well as the end use of the array; thus very high density, high density, moderate density,
low density and very low density arrays may be made. Preferred ranges for very high density arrays
are from about 10,000,000 to about 2,000,000,000, with from about 100,000,000 to about
1,000,000,000 being preferred (all numbers being per square cm). High density arrays range from about
100,000 to about 10,000,000, with from about 1,000,000 to about 5,000,000 being particularly
preferred. Moderate density arrays range from about 10,000 to about 100,000 being particularly
preferred, and from about 20,000 to about 50,000 being especially preferred. Low density arrays are
generally less than 10,000, with from about 1,000 to about 5,000 being preferred. Very low density
arrays are less than 1,000, with from about 10 to about 1000 being preferred, and from about 100 to
about 500 being particularly preferred. In some embodiments, the compositions of the invention may
not be in array format; that is, for some embodiments, compositions comprising a single bioactive
agent may be made as well. In addition, in some arrays, multiple substrates may be used, either of
different or identical compositions. Thus for example, large arrays may comprise a plurality of smaller
substrates.
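
The density classes above can be summarized as a small lookup. This is only a reading aid: the ranges in the text are approximate ("about"), so the exact boundary assignments in this sketch, and the function name `classify_density`, are assumptions.

```python
def classify_density(sites_per_cm2):
    """Map a site density (per square cm) to the density class named in the text.

    Boundary values are assigned to the higher class, which is an assumption.
    """
    if sites_per_cm2 >= 10_000_000:
        return "very high density"
    if sites_per_cm2 >= 100_000:
        return "high density"
    if sites_per_cm2 >= 10_000:
        return "moderate density"
    if sites_per_cm2 >= 1_000:
        return "low density"
    return "very low density"
```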

In addition, one advantage of the present compositions is that particularly through the use of fiber optic
technology, extremely high density arrays can be made. Thus for example, because beads of 200 µm
or less (with beads of 200 nm possible) can be used, and very small fibers are known, it is possible to
have as many as 40,000 or more (in some instances, 1 million) different elements (e.g. fibers and
beads) in a 1 mm² fiber optic bundle, with densities of greater than 25,000,000 individual beads and
fibers (again, in some instances as many as 50-100 million) per 0.5 cm² obtainable (4 million per
square cm for 5 µm center-to-center and 100 million per square cm for 1 µm center-to-center).
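
The per-square-centimeter figures quoted above follow from the center-to-center spacing if the sites are assumed to lie on a square grid, one site per spacing-by-spacing cell. The short sketch below reproduces that arithmetic; the square-grid assumption and the function name are illustrative, and other packings (e.g. hexagonal) would give somewhat different counts.

```python
def sites_per_cm2(pitch_um):
    """Number of sites per square cm on a square grid with the given pitch in µm."""
    sites_per_side = 10_000 / pitch_um  # 1 cm = 10,000 µm
    return sites_per_side ** 2


print(sites_per_cm2(5))  # 5 µm center-to-center
print(sites_per_cm2(1))  # 1 µm center-to-center
```

With a 5 µm spacing this gives 2,000 sites per side, i.e. 4 million per square cm, and with a 1 µm spacing 10,000 per side, i.e. 100 million per square cm, matching the values in the text.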

By "substrate" or "solid support" or other grammatical equivalents herein is meant any material that
can be modified to contain discrete individual sites appropriate for the attachment or association of
beads and is amenable to at least one detection method. As will be appreciated by those in the art,
the number of possible substrates is very large. Possible substrates include, but are not limited to,
glass and modified or functionalized glass, plastics (including acrylics, polystyrene and copolymers of
styrene and other materials, polypropylene, polyethylene, polybutylene, polyurethanes, Teflon, etc.),
polysaccharides, nylon or nitrocellulose, resins, silica or silica-based materials including silicon and
modified silicon, carbon, metals, inorganic glasses, plastics, optical fiber bundles, and a variety of
other polymers. In general, the substrates allow optical detection and do not themselves appreciably
fluoresce.

Generally the substrate is flat (planar), although as will be appreciated by those in the art, other
configurations of substrates may be used as well; for example, three dimensional configurations can
be used, for example by embedding the beads in a porous block of plastic that allows sample access
to the beads and using a confocal microscope for detection. Similarly, the beads may be placed on
the inside surface of a tube, for flow-through sample analysis to minimize sample volume. Preferred
substrates include optical fiber bundles as discussed below, and flat planar substrates such as glass,
polystyrene and other plastics and acrylics.

In a preferred embodiment, the substrate is an optical fiber bundle or array, as is generally described
in U.S.S.N.s 08/944,850 and 08/519,062, PCT US98/05025, and PCT US98/09163, all of which are
expressly incorporated herein by reference. Preferred embodiments utilize preformed unitary fiber
optic arrays. By "preformed unitary fiber optic array" herein is meant an array of discrete individual
fiber optic strands that are co-axially disposed and joined along their lengths. The fiber strands are
generally individually clad. However, one thing that distinguishes a preformed unitary array from other
fiber optic formats is that the fibers are not individually physically manipulatable; that is, one strand
generally cannot be physically separated at any point along its length from another fiber strand.

At least one surface of the substrate is modified to contain discrete, individual sites for later
association of microspheres. These sites may comprise physically altered sites, i.e. physical
configurations such as wells or small depressions in the substrate that can retain the beads, such that
a microsphere can rest in the well, or the use of other forces (magnetic or compressive), or chemically
altered or active sites, such as chemically functionalized sites, electrostatically altered sites,
hydrophobically/hydrophilically functionalized sites, spots of adhesive, etc.

The sites may be a pattern, i.e. a regular design or configuration, or randomly distributed. A preferred
embodiment utilizes a regular pattern of sites such that the sites may be addressed in the X-Y
coordinate plane. "Pattern" in this sense includes a repeating unit cell, preferably one that allows a
high density of beads on the substrate. However, it should be noted that these sites may not be
discrete sites. That is, it is possible to use a uniform surface of adhesive or chemical functionalities,
for example, that allows the attachment of beads at any position. That is, the surface of the substrate
is modified to allow attachment of the microspheres at individual sites, whether or not those sites are
contiguous or non-contiguous with other sites. Thus, the surface of the substrate may be modified
such that discrete sites are formed that can only have a single associated bead, or alternatively, the
surface of the substrate is modified and beads may go down anywhere, but they end up at discrete
sites.

In a preferred embodiment, the surface of the substrate is modified to contain wells, i.e. depressions in
the surface of the substrate. This may be done as is generally known in the art using a variety of
techniques, including, but not limited to, photolithography, stamping techniques, molding techniques
and microetching techniques. As will be appreciated by those in the art, the technique used will
depend on the composition and shape of the substrate.

In a preferred embodiment, physical alterations are made in a surface of the substrate to produce the 
sites. In a preferred embodiment, the substrate is a fiber optic bundle and the surface of the substrate 

30 is a terminal end of the fiber bundle, as is generally described in 08/818,199 and 09/151,877, both of 
which are hereby expressly incorporated by reference. In this embodiment, wells are made in a 
terminal or distal end of a fiber optic bundle comprising individual fibers. In this embodiment, the 
cores of the individual fibers are etched, with respect to the cladding, such that small wells or 
depressions are formed at one end of the fibers. The required depth of the wells will depend on the
size of the beads to be added to the wells.

Generally in this embodiment, the microspheres are non-covalently associated in the wells, although 
the wells may additionally be chemically functionalized as is generally described below, cross-linking 
agents may be used, or a physical barrier may be used, i.e. a film or membrane over the beads. 

In a preferred embodiment, the surface of the substrate is modified to contain chemically modified
sites, that can be used to attach, either covalently or non-covalently, the microspheres of the invention
to the discrete sites or locations on the substrate. "Chemically modified sites" in this context includes,
but is not limited to, the addition of a pattern of chemical functional groups including amino groups,
carboxy groups, oxo groups and thiol groups, that can be used to covalently attach microspheres,
which generally also contain corresponding reactive functional groups; the addition of a pattern of 
adhesive that can be used to bind the microspheres (either by prior chemical functionalization for the 
addition of the adhesive or direct addition of the adhesive); the addition of a pattern of charged groups 
(similar to the chemical functionalities) for the electrostatic attachment of the microspheres, i.e. when
the microspheres comprise charged groups opposite to the sites; the addition of a pattern of chemical
functional groups that renders the sites differentially hydrophobic or hydrophilic, such that the addition 
of similarly hydrophobic or hydrophilic microspheres under suitable experimental conditions will result 
in association of the microspheres to the sites on the basis of hydroaffinity. For example, the use of 
hydrophobic sites with hydrophobic beads, in an aqueous system, drives the association of the beads
preferentially onto the sites. As outlined above, "pattern" in this sense includes the use of a uniform
treatment of the surface to allow attachment of the beads at discrete sites, as well as treatment of the
surface resulting in discrete sites. As will be appreciated by those in the art, this may be accomplished
in a variety of ways. 

In a preferred embodiment, the compositions of the invention further comprise a population of
microspheres. By "population" herein is meant a plurality of beads as outlined above for arrays. 
Within the population are separate subpopulations, which can be a single microsphere or multiple 
identical microspheres. That is, in some embodiments, as is more fully outlined below, the array may 
contain only a single bead for each capture probe; preferred embodiments utilize a plurality of beads of
each type.

By "microspheres" or "beads" or "particles" or grammatical equivalents herein is meant small discrete 
particles. The composition of the beads will vary, depending on the class of capture probe and the 
method of synthesis. Suitable bead compositions include those used in peptide, nucleic acid and
organic moiety synthesis, including, but not limited to, plastics, ceramics, glass, polystyrene,
methylstyrene, acrylic polymers, paramagnetic materials, thoria sol, carbon graphite, titanium dioxide,
latex or cross-linked dextrans such as Sepharose, cellulose, nylon, cross-linked micelles and Teflon.
"Microsphere Detection Guide" from Bangs Laboratories, Fishers IN is a helpful
guide.

The beads need not be spherical; irregular particles may be used. In addition, the beads may be 
porous, thus increasing the surface area of the bead available for either capture probe attachment or 
tag attachment. The bead sizes range from nanometers, e.g. 100 nm, to millimeters, e.g. 1 mm, with
beads from about 0.2 micron to about 200 microns being preferred, and from about 0.5 to about 5
micron being particularly preferred, although in some embodiments smaller beads may be used.

It should be noted that a key component of this embodiment of the invention is the use of a 
substrate/bead pairing that allows the association or attachment of the beads at discrete sites on the 
surface of the substrate, such that the beads do not move during the course of the assay.

Each microsphere comprises a capture probe, although as will be appreciated by those in the art, 
there may be some microspheres which do not contain a capture probe, depending on the synthetic 
methods. Alternatively, some have more than one capture probe. 

Attachment of the nucleic acids may be done in a variety of ways, as will be appreciated by those in 
the art, including, but not limited to, chemical or affinity capture (for example, including the 
incorporation of derivatized nucleotides such as AminoLink or biotinylated nucleotides that can then be 
used to attach the nucleic acid to a surface, as well as affinity capture by hybridization), cross-linking,
and electrostatic attachment, etc. In a preferred embodiment, affinity capture is used to attach the
nucleic acids to the beads. For example, nucleic acids can be derivatized, for example with one
member of a binding pair, and the beads derivatized with the other member of a binding pair. Suitable
binding pairs are as described herein for IBL/DBL pairs. For example, the nucleic acids may be
biotinylated (for example using enzymatic incorporation of biotinylated nucleotides, or by
photoactivated cross-linking of biotin). Biotinylated nucleic acids can then be captured on
streptavidin-coated beads, as is known in the art. Similarly, other hapten-receptor combinations can be used, such
as digoxigenin and anti-digoxigenin antibodies. Alternatively, chemical groups can be added in the
form of derivatized nucleotides, that can then be used to add the nucleic acid to the surface.

Preferred attachments are covalent, although even relatively weak interactions (i.e. non-covalent) can
be sufficient to attach a nucleic acid to a surface, if there are multiple sites of attachment per
nucleic acid. Thus, for example, electrostatic interactions can be used for attachment, for example by
having beads carrying the opposite charge to the bioactive agent.

Similarly, affinity capture utilizing hybridization can be used to attach nucleic acids to beads. For
example, as is known in the art, polyA+ RNA is routinely captured by hybridization to oligo-dT beads;
this may include oligo-dT capture followed by a cross-linking step, such as psoralen crosslinking. If
the nucleic acids of interest do not contain a polyA tract, one can be attached by polymerization with
terminal transferase, or via ligation of an oligoA linker, as is known in the art.

Alternatively, chemical crosslinking may be done, for example by photoactivated crosslinking of 
thymidine to reactive groups, as is known in the art. 

In a preferred embodiment, each bead comprises a single type of capture probe, although a plurality of 
individual capture probes are preferably attached to each bead. Similarly, preferred embodiments
utilize more than one microsphere containing a unique capture probe; that is, there is redundancy built 
into the system by the use of subpopulations of microspheres, each microsphere in the subpopulation 
containing the same capture probe. 

In an alternative embodiment, each bead comprises a plurality of different capture probes. 

As will be appreciated by those in the art, the capture probes may either be synthesized directly on the 
beads, or they may be made and then attached after synthesis. In a preferred embodiment, linkers 
are used to attach the capture probes to the beads, to allow both good attachment, sufficient flexibility
to allow good interaction with the target molecule, and to avoid undesirable binding reactions. 

In a preferred embodiment, the capture probes are synthesized directly on the beads. As is known in 
the art, many classes of chemical compounds are currently synthesized on solid supports, such as 
peptides, organic moieties, and nucleic acids. It is a relatively straightforward matter to adjust the
current synthetic techniques to use beads. 

In a preferred embodiment, the capture probes are synthesized first, and then covalently attached to 
the beads. As will be appreciated by those in the art, this will be done depending on the composition 
of the capture probes and the beads. The functionalization of solid support surfaces such as certain
polymers with chemically reactive groups such as thiols, amines, carboxyls, etc. is generally known in
the art. Accordingly, "blank" microspheres may be used that have surface chemistries that facilitate
the attachment of the desired functionality by the user. Some examples of these surface chemistries
for blank microspheres include, but are not limited to, amino groups including aliphatic and aromatic
amines, carboxylic acids, aldehydes, amides, chloromethyl groups, hydrazide, hydroxyl groups,
sulfonates and sulfates.

In a preferred embodiment the attachment of nucleic acids to substrates includes contacting the 
oligonucleotide and the solid support in the presence of high salt concentrations. As is appreciated by 
those skilled in the art, salt includes, but is not limited to sodium chloride, potassium chloride, calcium
chloride, magnesium chloride, lithium chloride, rubidium chloride, cesium chloride, barium chloride and 
the like. In a preferred embodiment, salt as used in the invention includes sodium chloride. 

By high salt concentrations is meant salt that is more concentrated than about 0.1 M salt. In a 
preferred embodiment, by high salt concentrations is meant greater than about 0.2 M salt. In a
particularly preferred embodiment, high salt concentrations include from about 0.5 to 3M salt, with
about 1M to 2M being most preferred. 
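
The nested concentration ranges above can be summarized in a short sketch. The function name `salt_tier` and the tier labels are hypothetical, introduced only for illustration, and the hard cutoffs stand in for values the text qualifies with "about", so they should be read as approximate.

```python
def salt_tier(molar: float) -> str:
    """Classify a salt concentration (in mol/L) against the ranges in the text.

    Hypothetical helper: the thresholds are the "about" values from the
    description, treated here as exact cutoffs purely for illustration.
    """
    if molar < 0:
        raise ValueError("concentration cannot be negative")
    if 1.0 <= molar <= 2.0:
        return "most preferred"          # about 1 M to 2 M
    if 0.5 <= molar <= 3.0:
        return "particularly preferred"  # about 0.5 M to 3 M
    if molar > 0.2:
        return "preferred"               # greater than about 0.2 M
    if molar > 0.1:
        return "high salt"               # more concentrated than about 0.1 M
    return "not high salt"


if __name__ == "__main__":
    for conc in (0.05, 0.15, 0.3, 0.5, 1.5):
        print(f"{conc} M -> {salt_tier(conc)}")
```

The checks run from the narrowest range outward so that a concentration falling in several ranges is reported under the most specific one.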

By solid support or other grammatical equivalents herein is meant any material that can be modified 
to contain oligonucleotides. As will be appreciated by those in the art, the number of possible solid
supports is very large. Possible solid supports include, but are not limited to beads, glass and 
modified or functionalized glass, plastics (including acrylics, polystyrene and copolymers of styrene 
and other materials, polypropylene, polyethylene, polybutylene, polyurethanes, Teflon, etc.), 
polysaccharides, nylon or nitrocellulose, resins, silica or silica-based materials including silicon and
modified silicon, carbon, metals, inorganic glasses, optical fiber bundles, and a variety of
other polymers. 

Once formed, the support containing the oligonucleotides finds use in a variety of systems including
decoding arrays as described in more detail in U.S.S.N. 09/344,526, and U.S.S.N. 09/574,117, both of
which are expressly incorporated herein by reference. In addition, the support containing the
oligonucleotides finds use in microfluidic systems as described in U.S.S.N. 09/306,369 which is
expressly incorporated herein by reference. In addition, the support containing the oligonucleotides
finds use in composite array systems as described in U.S.S.N. 09/606,369, which is expressly
incorporated herein by reference. In addition the support containing the oligonucleotides finds use in a
variety of assays as outlined in more detail in U.S.S.N.s 09/513,362, 09/517,945, 09/535,854,
60/160,917, 60/180,810, 60/182,955, and 09/566,463, all of which are expressly incorporated herein
by reference in their entirety. In addition, the support containing the oligonucleotides finds use in array
based sensors as described in more detail in 09/287,573, 09/260,963, 09/450,829, 09/151,877,
09/187,289 and 08/519,062, all of which are expressly incorporated herein by reference in their
entirety.

Accordingly the invention provides a method of attaching oligonucleotides to a solid support. The
method includes contacting the oligonucleotides with the support in the presence of high salt as
described herein. Once attached, as discussed in the examples, the attached oligonucleotides readily
hybridize to targets, probes and the like. Attachment of crude oligonucleotides in the presence of high
salt is as efficient as attaching purified oligonucleotides. Thus, the invention also contemplates a
method of attachment of oligonucleotides to a solid support without prior purification of the
oligonucleotides. Again, the method includes contacting the crude oligonucleotides with a solid
support in the presence of high salt as described herein.

The capture probes are designed to be substantially complementary to the adapter sequences, to 
allow for a minimum of cross reactivity. 

When microsphere arrays are used, an encoding/decoding system must be used. That is, since the
beads are generally put onto the substrate randomly, there are several ways to correlate the 
functionality on the bead with its location, including the incorporation of unique optical signatures, 
generally fluorescent dyes, that could be used to identify the chemical functionality on any particular 
bead. This allows the synthesis of the candidate agents (i.e. compounds such as nucleic acids and 
antibodies) to be divorced from their placement on an array, i.e. the candidate agents may be
synthesized on the beads, and then the beads are randomly distributed on a patterned surface. Since
the beads are first coded with an optical signature, this means that the array can later be "decoded",
i.e. after the array is made, a correlation of the location of an individual site on the array with the bead
or candidate agent at that particular site can be made. This means that the beads may be randomly
distributed on the array, a fast and inexpensive process as compared to either the in situ synthesis or 
spotting techniques of the prior art. 

However, the drawback to these methods is that for a large array, the system requires a large number
of different optical signatures, which may be difficult or time-consuming to utilize. Accordingly, the
present invention provides several improvements over these methods, generally directed to methods
of coding and decoding the arrays. That is, as will be appreciated by those in the art, the placement of
the capture probes is generally random, and thus a coding/decoding system is required to identify the
probe at each location in the array. This may be done in a variety of ways, as is more fully outlined
below, and generally includes: a) the use of a decoding binding ligand (DBL), generally directly labeled,
that binds to either the capture probe or to identifier binding ligands (IBLs) attached to the beads; b)
positional decoding, for example by either targeting the placement of beads (for example by using
photoactivatible or photocleavable moieties to allow the selective addition of beads to particular
locations), or by using either sub-bundles or selective loading of the sites, as are more fully outlined
below; c) selective decoding, wherein only those beads that bind to a target are decoded; or d)
combinations of any of these. In some cases, as is more fully outlined below, this decoding may occur
for all the beads, or only for those that bind a particular target sequence. Similarly, this may occur
either prior to or after addition of a target sequence. In addition, as outlined herein, the target
sequences detected may be either a primary target sequence (e.g. a patient sample), or a reaction
product from one of the methods described herein (e.g. an extended SBE probe, a ligated probe, a
cleaved signal probe, etc.).

Once the identity (i.e. the actual agent) and location of each microsphere in the array have been fixed,
the array is exposed to samples containing the target sequences, although as outlined below, this can
be done prior to or during the analysis as well. The target sequences can hybridize (either directly or
indirectly) to the capture probes as is more fully outlined below, which results in a change in the optical
signal of a particular bead.

In the present invention, "decoding" may not rely on the use of optical signatures, but rather on the use
of decoding binding ligands that are added during a decoding step. The decoding binding ligands will
bind either to a distinct identifier binding ligand partner that is placed on the beads, or to the capture
probe itself. In this embodiment the decoding binding ligand is complementary to the capture
probe; that is, the decoding binding ligand has the sequence of the adapter that also
binds to the capture probe. In a preferred embodiment the decoder binding ligand is a nucleic acid
that has the sequence of at least one of the nucleic acids set forth in Table 1.



The decoding binding ligands are either directly or indirectly labeled, and thus decoding occurs by
detecting the presence of the label. By using pools of decoding binding ligands in a sequential
fashion, it is possible to greatly minimize the number of required decoding steps.

In some embodiments, the microspheres may additionally comprise identifier binding ligands for use in 
certain decoding systems. By "identifier binding ligands" or "IBLs" herein is meant a compound that 
will specifically bind a corresponding decoder binding ligand (DBL) to facilitate the elucidation of the
identity of the capture probe attached to the bead. That is, the IBL and the corresponding DBL form a
binding partner pair. By "specifically bind" herein is meant that the IBL binds its DBL with specificity 
sufficient to differentiate between the corresponding DBL and other DBLs (that is, DBLs for other 
IBLs), or other components or contaminants of the system. The binding should be sufficient to remain 
bound under the conditions of the decoding step, including wash steps to remove non-specific binding. 

In some embodiments, for example when the IBLs and corresponding DBLs are proteins or nucleic
acids, the dissociation constants of the IBL to its DBL will be less than about 10^-4 to 10^-6 M^-1, with less
than about 10^-5 to 10^-9 M^-1 being preferred and less than about 10^-7 to 10^-9 M^-1 being particularly
preferred. 

IBL-DBL binding pairs are known or can be readily found using known techniques. For example, when
the IBL is a protein, the DBLs include proteins (particularly including antibodies or fragments thereof
(FAbs, etc.)) or small molecules, or vice versa (the IBL is an antibody and the DBL is a protein). Metal
ion-metal ion ligand or chelator pairs are also useful. Antigen-antibody pairs, enzymes and
substrates or inhibitors, other protein-protein interacting pairs, receptor-ligands, complementary
nucleic acids, and carbohydrates and their binding partners are also suitable binding pairs. Nucleic
acid-nucleic acid binding protein pairs are also useful. Similarly, as is generally described in U.S.
Patents 5,270,163, 5,475,096, 5,567,588, 5,595,877, 5,637,459, 5,683,867, 5,705,337, and related
patents, hereby incorporated by reference, nucleic acid "aptamers" can be developed for binding to
virtually any target; such an aptamer-target pair can be used as the IBL-DBL pair. Similarly, there is a
wide body of literature relating to the development of binding pairs based on combinatorial chemistry
methods.

In a preferred embodiment, the IBL is a molecule whose color or luminescence properties change in 
the presence of a selectively-binding DBL. For example, the IBL may be a fluorescent pH indicator 
whose emission intensity changes with pH. Similarly, the IBL may be a fluorescent ion indicator,
whose emission properties change with ion concentration. 

Alternatively, the IBL is a molecule whose color or luminescence properties change in the presence of 
various solvents. For example, the IBL may be a fluorescent molecule such as an ethidium salt whose 
fluorescence intensity increases in hydrophobic environments. Similarly, the IBL may be a derivative
of fluorescein whose color changes between aqueous and nonpolar solvents. 

In one embodiment, the DBL may be attached to a bead, i.e. a "decoder bead", that may carry a label
such as a fluorophore.



In a preferred embodiment, the IBL-DBL pair comprises substantially complementary single-stranded
nucleic acids. In this embodiment, the binding ligands can be referred to as "identifier probes" and 
"decoder probes". Generally, the identifier and decoder probes range from about 4 basepairs in length 
to about 1000, with from about 6 to about 100 being preferred, and from about 8 to about 40 being 
particularly preferred. What is important is that the probes are long enough to be specific, i.e. to 
distinguish between different IBL-DBL pairs, yet short enough to allow both a) dissociation, if 
necessary, under suitable experimental conditions, and b) efficient hybridization. 

In a preferred embodiment, as is more fully outlined below, the IBLs do not bind to DBLs. Rather, the 
IBLs are used as identifier moieties ("IMs") that are identified directly, for example through the use of 
mass spectroscopy. 



Alternatively, in a preferred embodiment, the IBL and the capture probe are the same moiety; thus, for 
example, as outlined herein, particularly when no optical signatures are used, the capture probe can
serve as both the identifier and the agent. For example, in the case of nucleic acids, the bead-bound 
probe (which serves as the capture probe) can also bind decoder probes, to identify the sequence of 
the probe on the bead. Thus, in this embodiment, the DBLs bind to the capture probes. 

In one embodiment, the microspheres may contain an optical signature. That is, as outlined in
U.S.S.N.s 08/818,199 and 09/151,877, previous work had each subpopulation of microspheres
comprising a unique optical signature or optical tag that is used to identify the unique capture probe of
that subpopulation of microspheres; that is, decoding utilizes optical properties of the beads such that
a bead comprising the unique optical signature may be distinguished from beads at other locations
with different optical signatures. Thus the previous work assigned each capture probe a unique optical
signature such that any microspheres comprising that capture probe are identifiable on the basis of the
signature. These optical signatures comprised dyes, usually chromophores or fluorophores, that were
entrapped or attached to the beads themselves. Diversity of optical signatures utilized different
fluorochromes, different ratios of mixtures of fluorochromes, and different concentrations (intensities)
of fluorochromes.



In a preferred embodiment, the present invention does not rely solely on the use of optical properties 
to decode the arrays. However, as will be appreciated by those in the art, it is possible in some 
embodiments to utilize optical signatures as an additional coding method, in conjunction with the 
present system. Thus, for example, as is more fully outlined below, the size of the array may be
effectively increased while using a single set of decoding moieties in several ways, one of which is the
use of optical signatures on some beads. Thus, for example, using one "set" of decoding molecules,
the use of two populations of beads, one with an optical signature and one without, allows the effective 
doubling of the array size. The use of multiple optical signatures similarly increases the possible size 
of the array. 
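
The scaling described above can be illustrated with a minimal sketch. It assumes, as one reading of the text, that beads without any optical signature form one population and that each additional optical signature adds one more population, each of which can reuse the same set of decoding molecules. The function and parameter names are hypothetical.

```python
def effective_array_size(n_decoder_codes: int, n_optical_signatures: int = 0) -> int:
    """Number of distinguishable bead types for one set of decoding molecules.

    Assumption for illustration: unsigned beads count as one population, and
    each optical signature defines one further population that reuses the
    full set of decoder codes.
    """
    if n_decoder_codes < 0 or n_optical_signatures < 0:
        raise ValueError("counts must be non-negative")
    populations = n_optical_signatures + 1  # +1 for beads with no signature
    return n_decoder_codes * populations


if __name__ == "__main__":
    print(effective_array_size(100))     # one population, no optical signature
    print(effective_array_size(100, 1))  # signed plus unsigned beads: doubled
    print(effective_array_size(100, 3))  # three signatures plus unsigned beads
```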

In a preferred embodiment, each subpopulation of beads comprises a plurality of different IBLs. By 
using a plurality of different IBLs to encode each capture probe, the number of possible unique codes 
is substantially increased. That is, by using one unique IBL per capture probe, the size of the array will 
be the number of unique IBLs (assuming no "reuse" occurs, as outlined below). However, by using a 
plurality of different IBLs per bead, n, the size of the array can be increased to 2^n, when the presence
or absence of each IBL is used as the indicator. For example, the assignment of 10 IBLs per bead
generates a 10 bit binary code, where each bit can be designated as "1" (IBL is present) or "0" (IBL is
absent). A 10 bit binary code has 2^10 possible variants. However, as is more fully discussed below, the
size of the array may be further increased if another parameter is included such as concentration or
intensity; thus for example, if two different concentrations of the IBL are used, then the array size
increases as 3^n. Thus, in this embodiment, each individual capture probe in the array is assigned a
combination of IBLs, which can be added to the beads prior to the addition of the capture probe, after, 
or during the synthesis of the capture probe, i.e. simultaneous addition of IBLs and capture probe 
components. 
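
The counting argument above can be checked with a short sketch. It treats each IBL as a digit that is either absent (0) or present at one of a fixed number of concentration levels, so n IBLs at one level give 2^n codes and n IBLs at two levels give 3^n codes. The function names are hypothetical and are introduced only for illustration.

```python
from itertools import product


def code_capacity(n_ibls: int, levels_per_ibl: int = 1) -> int:
    """Count distinct codes when each IBL is absent or present at one of
    `levels_per_ibl` distinguishable concentrations."""
    if n_ibls < 0 or levels_per_ibl < 1:
        raise ValueError("need n_ibls >= 0 and levels_per_ibl >= 1")
    return (levels_per_ibl + 1) ** n_ibls


def enumerate_codes(n_ibls: int, levels_per_ibl: int = 1):
    """List every code as a tuple; 0 means the IBL is absent, 1..levels
    gives the concentration level at which it is present."""
    return list(product(range(levels_per_ibl + 1), repeat=n_ibls))


if __name__ == "__main__":
    print(code_capacity(10))      # 2^10 binary codes
    print(code_capacity(10, 2))   # 3^10 codes with two concentrations
    print(enumerate_codes(2))     # the four 2-bit presence/absence codes
```

Note that the count includes the all-absent code, which is only usable if a bead carrying no IBL can still be recognized as a bead.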

Alternatively, the combination of different IBLs can be used to elucidate the sequence of the nucleic 
acid. Thus, for example, using two different IBLs (IBL1 and IBL2), the first position of a nucleic acid 
can be elucidated: for example, adenosine can be represented by the presence of both IBL1 and IBL2; 
thymidine can be represented by the presence of IBL1 but not IBL2, cytosine can be represented by
the presence of IBL2 but not IBL1, and guanosine can be represented by the absence of both. The
second position of the nucleic acid can be done in a similar manner using IBL3 and IBL4; thus, the
presence of IBL1, IBL2, IBL3 and IBL4 gives a sequence of AA; IBL1, IBL2, and IBL3 shows the
sequence AT; IBL1, IBL3 and IBL4 gives the sequence TA, etc. The third position utilizes IBL5 and
IBL6, etc. In this way, the use of 20 different identifiers can yield a unique code for every possible
10-mer.

In this way, a sort of "bar code" for each sequence can be constructed; the presence or absence of 
each distinct IBL will allow the identification of each capture probe. 
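
The two-IBLs-per-position scheme above can be sketched as follows. The base-to-IBL mapping is taken from the example in the text (A: both present, T: first only, C: second only, G: neither); the numbering rule that position k (counting from zero) uses IBL(2k+1) and IBL(2k+2), and the function names, are assumptions made for illustration.

```python
# Presence flags for (first IBL, second IBL) at a given sequence position,
# following the example mapping in the text.
BASE_TO_FLAGS = {
    "A": (True, True),
    "T": (True, False),
    "C": (False, True),
    "G": (False, False),
}
FLAGS_TO_BASE = {flags: base for base, flags in BASE_TO_FLAGS.items()}


def encode(sequence: str) -> set:
    """Return the set of IBL numbers that would be present on a bead
    carrying a capture probe with the given sequence."""
    ibls = set()
    for k, base in enumerate(sequence.upper()):
        first, second = BASE_TO_FLAGS[base]
        if first:
            ibls.add(2 * k + 1)
        if second:
            ibls.add(2 * k + 2)
    return ibls


def decode(ibls: set, length: int) -> str:
    """Recover the sequence from the detected IBLs. The length must be known,
    because guanosine is encoded by the absence of both IBLs."""
    return "".join(
        FLAGS_TO_BASE[(2 * k + 1 in ibls, 2 * k + 2 in ibls)]
        for k in range(length)
    )


if __name__ == "__main__":
    print(sorted(encode("AA")))  # [1, 2, 3, 4]
    print(sorted(encode("AT")))  # [1, 2, 3]
    print(sorted(encode("TA")))  # [1, 3, 4]
    probe = "GATTACAGGC"
    print(decode(encode(probe), len(probe)) == probe)  # True
```

A 10-mer uses IBL1 through IBL20, which matches the 20 identifiers mentioned in the text.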

In addition, the use of different concentrations or densities of IBLs allows a "reuse" of sorts. If, for 
example, the bead comprising a first agent has a 1X concentration of IBL, and a second bead 
comprising a second agent has a 10X concentration of IBL, using saturating concentrations of the 
corresponding labelled DBL allows the user to distinguish between the two beads.

Once the microspheres comprising the capture probes are generated, they are added to the substrate 
to form an array. It should be noted that while most of the methods described herein add the beads to 
the substrate prior to the assay, the order of making, using and decoding the array can vary. For 
example, the array can be made, decoded, and then the assay done. Alternatively, the array can be 
made, used in an assay, and then decoded; this may find particular use when only a few beads need 
be decoded. Alternatively, the beads can be added to the assay mixture, i.e. the sample containing 
the target sequences, prior to the addition of the beads to the substrate; after addition and assay, the 
array may be decoded. This is particularly preferred when the sample comprising the beads is 
agitated or mixed; this can increase the amount of target sequence bound to the beads per unit time, 
and thus (in the case of nucleic acid assays) increase the hybridization kinetics. This may find 
particular use in cases where the concentration of target sequence in the sample is low; generally, for 
low concentrations, long binding times must be used. 

In general, the methods of making the arrays and of decoding the arrays are done to maximize the
number of different candidate agents that can be uniquely encoded. The compositions of the invention 
may be made in a variety of ways. In general, the arrays are made by adding a solution or slurry 
comprising the beads to a surface containing the sites for attachment of the beads. This may be done 
in a variety of buffers, including aqueous and organic solvents, and mixtures. The solvent can 
evaporate, and excess beads are removed. 

In a preferred embodiment, when non-covalent methods are used to associate the beads with the 
array, a novel method of loading the beads onto the array is used. This method comprises exposing 
the array to a solution of particles (including microspheres and cells) and then applying energy, e.g. 
agitating or vibrating the mixture. This results in an array comprising more tightly associated particles, 
as the agitation is done with sufficient energy to cause weakly-associated beads to fall off (or out, in 
the case of wells). These sites are then available to bind a different bead. In this way, beads that 
exhibit a high affinity for the sites are selected. Arrays made in this way have two main advantages as 
compared to a more static loading: first of all, a higher percentage of the sites can be filled easily, and 
secondly, the arrays thus loaded show a substantial decrease in bead loss during assays. Thus, in a 
preferred embodiment, these methods are used to generate arrays that have at least about 50% of the 
sites filled, with at least about 75% being preferred, and at least about 90% being particularly 
preferred. Similarly, arrays generated in this manner preferably lose less than about 20% of the beads 
during an assay, with less than about 10% being preferred and less than about 5% being particularly 
preferred. 

In this embodiment, the substrate comprising the surface with the discrete sites is immersed into a 
solution comprising the particles (beads, cells, etc.). The surface may comprise wells, as is described 
herein, or other types of sites on a patterned surface such that there is a differential affinity for the
sites. This differential affinity results in a competitive process, such that particles that will associate
more tightly are selected. Preferably, the entire surface to be "loaded" with beads is in fluid contact
with the solution. This solution is generally a slurry ranging from about 10,000:1 beads:solution
(vol:vol) to 1:1. Generally, the solution can comprise any number of reagents, including aqueous
buffers, organic solvents, salts, other reagent components, etc. In addition, the solution preferably 
comprises an excess of beads; that is, there are more beads than sites on the array. Preferred 
embodiments utilize two-fold to billion-fold excess of beads. 

The immersion can mimic the assay conditions; for example, if the array is to be "dipped" from above 
into a microtiter plate comprising samples, this configuration can be repeated for the loading, thus 
minimizing the beads that are likely to fall out due to gravity. 

Once the surface has been immersed, the substrate, the solution, or both are subjected to a 
competitive process, whereby the particles with lower affinity can be disassociated from the substrate 
and replaced by particles exhibiting a higher affinity to the site. This competitive process is done by 
the introduction of energy, in the form of heat, sonication, stirring or mixing, vibrating or agitating the 
solution or substrate, or both. 



A preferred embodiment utilizes agitation or vibration. In general, the amount of manipulation of the 
substrate is minimized to prevent damage to the array; thus, preferred embodiments utilize the 
agitation of the solution rather than the array, although either will work. As will be appreciated by those 
in the art, this agitation can take on any number of forms, with a preferred embodiment utilizing 
microtiter plates comprising bead solutions being agitated using microtiter plate shakers. 

The agitation proceeds for a period of time sufficient to load the array to a desired fill. Depending on 
the size and concentration of the beads and the size of the array, this time may range from about 1 
second to days, with from about 1 minute to about 24 hours being preferred. 

It should be noted that not all sites of an array may comprise a bead; that is, there may be some sites 
on the substrate surface which are empty. In addition, there may be some sites that contain more 
than one bead, although this is not preferred. 

In some embodiments, for example when chemical attachment is done, it is possible to attach the 
beads in a non-random or ordered way. For example, using photoactivatable attachment linkers or 
photoactivatable adhesives or masks, selected sites on the array may be sequentially rendered suitable 
for attachment, such that defined populations of beads are laid down. 

The arrays of the present invention are constructed such that information about the identity of the 
capture probe is built into the array, such that the random deposition of the beads in the fiber wells can 
be "decoded" to allow identification of the capture probe at all positions. This may be done in a variety 
of ways, and either before, during or after the use of the array to detect target molecules. 

Thus, after the array is made, it is "decoded" in order to identify the location of one or more of the 
capture probes, i.e. each subpopulation of beads, on the substrate surface. 

In a preferred embodiment, pyrosequencing techniques are used to decode the array, as is generally 
described in "Nucleic Acid Sequencing using Microsphere Arrays", filed October 22, 1999 (no U.S.S.N. 
received yet), hereby incorporated by reference. 

In a preferred embodiment, a selective decoding system is used. In this case, only those 
microspheres exhibiting a change in the optical signal as a result of the binding of a target sequence 
are decoded. This is commonly done when the number of "hits", i.e. the number of sites to decode, is 
generally low. That is, the array is first scanned under experimental conditions in the absence of the 
target sequences. The sample containing the target sequences is added, and only those locations 
exhibiting a change in the optical signal are decoded. For example, the beads at either the positive or 
negative signal locations may be either selectively tagged or released from the array (for example 
through the use of photocleavable linkers), and subsequently sorted or enriched in a 
fluorescence-activated cell sorter (FACS). That is, either all the negative beads are released, and then the positive 
beads are either released or analyzed in situ, or alternatively all the positives are released and 
analyzed. Alternatively, the labels may comprise halogenated aromatic compounds, and detection of 
the label is done using for example gas chromatography, chemical tags, isotopic tags, or mass spectral 
tags. 

As will be appreciated by those in the art, this may also be done in systems where the array is not 
decoded; i.e. there need not ever be a correlation of bead composition with location. In this 
embodiment, the beads are loaded on the array, and the assay is run. The "positives", i.e. those 
beads displaying a change in the optical signal as is more fully outlined below, are then "marked" to 
distinguish or separate them from the "negative" beads. This can be done in several ways, preferably 
using fiber optic arrays. In a preferred embodiment, each bead contains a fluorescent dye. After the 
assay and the identification of the "positives" or "active beads", light is shone down either only the 
positive fibers or only the negative fibers, generally in the presence of a light-activated reagent 
(typically dissolved oxygen). In the former case, all the active beads are photobleached. Thus, upon 
non-selective release of all the beads with subsequent sorting, for example using a fluorescence 
activated cell sorter (FACS) machine, the non-fluorescent active beads can be sorted from the 
fluorescent negative beads. Alternatively, when light is shone down the negative fibers, all the 
negatives are non-fluorescent and the positives are fluorescent, and sorting can proceed. The 
characterization of the attached capture probe may be done directly, for example using mass 
spectroscopy. 



Alternatively, the identification may occur through the use of identifier moieties ("IMs"), which are 
similar to IBLs but need not necessarily bind to DBLs. That is, rather than elucidate the structure of 
the capture probe directly, the composition of the IMs may serve as the identifier. Thus, for example, 
a specific combination of IMs can serve to code the bead, and be used to identify the agent on the 
bead upon release from the bead followed by subsequent analysis, for example using a gas 
chromatograph or mass spectroscope. 

Alternatively, rather than having each bead contain a fluorescent dye, each bead comprises a 
non-fluorescent precursor to a fluorescent dye. For example, using photocleavable protecting groups, 
such as certain ortho-nitrobenzyl groups, on a fluorescent molecule, photoactivation of the 
fluorochrome can be done. After the assay, light is shone down again either the "positive" or the 
"negative" fibers, to distinguish these populations. The illuminated precursors are then chemically 
converted to a fluorescent dye. All the beads are then released from the array, with sorting, to form 
populations of fluorescent and non-fluorescent beads (either the positives and the negatives or vice 
versa). 

In an alternate preferred embodiment, the sites of attachment of the beads (for example the wells) 
include a photopolymerizable reagent, or the photopolymerizable agent is added to the assembled 
array. After the test assay is run, light is shone down again either the "positive" or the "negative" 
fibers, to distinguish these populations. As a result of the irradiation, either all the positives or all the 
negatives are polymerized and trapped or bound to the sites, while the other population of beads can 
be released from the array. 

In a preferred embodiment, the location of every capture probe is determined using decoder binding 
ligands (DBLs). As outlined above, DBLs are binding ligands that will either bind to identifier binding 
ligands, if present, or to the capture probes themselves, preferably when the capture probe is a nucleic 
acid or protein. 

In a preferred embodiment, as outlined above, the DBL binds to the IBL. 

In a preferred embodiment, the capture probes are single-stranded nucleic acids and the DBL is a 
substantially complementary single-stranded nucleic acid that binds (hybridizes) to the capture probe, 
termed a decoder probe herein. A decoder probe that is substantially complementary to each 
candidate probe is made and used to decode the array. In this embodiment, the candidate probes and 
the decoder probes should be of sufficient length (and the decoding step run under suitable 
conditions) to allow specificity; i.e. each candidate probe binds to its corresponding decoder probe with 
sufficient specificity to allow the distinction of each candidate probe. 

In a preferred embodiment, the DBLs are either directly or indirectly labeled. In a preferred 
embodiment, the DBL is directly labeled, that is, the DBL comprises a label. In an alternate 
embodiment, the DBL is indirectly labeled; that is, a labeling binding ligand (LBL) that will bind to the 
DBL is used. In this embodiment, the labeling binding ligand-DBL pair can be as described above for 
IBL-DBL pairs. 

Accordingly, the identification of the location of the individual beads (or subpopulations of beads) is 
done using one or more decoding steps comprising a binding between the labeled DBL and either the 
IBL or the capture probe (i.e. a hybridization between the candidate probe and the decoder probe 
when the capture probe is a nucleic acid). After decoding, the DBLs can be removed and the array 
can be used; however, in some circumstances, for example when the DBL binds to an IBL and not to 
the capture probe, the removal of the DBL is not required (although it may be desirable in some 
circumstances). In addition, as outlined herein, decoding may be done either before the array is used 
in an assay, during the assay, or after the assay. 

In one embodiment, a single decoding step is done. In this embodiment, each DBL is labeled with a 
unique label, such that the number of unique tags is equal to or greater than the number of capture 
probes (although in some cases, "reuse" of the unique labels can be done, as described herein; 
similarly, minor variants of candidate probes can share the same decoder, if the variants are encoded 
in another dimension, i.e. in the bead size or label). For each capture probe or IBL, a DBL is made 
that will specifically bind to it and contains a unique tag, for example one or more fluorochromes. 
Thus, the identity of each DBL, both its composition (i.e. its sequence when it is a nucleic acid) and its 
label, is known. Then, by adding the DBLs to the array containing the capture probes under conditions 
which allow the formation of complexes (termed hybridization complexes when the components are 
nucleic acids) between the DBLs and either the capture probes or the IBLs, the location of each DBL 
can be elucidated. This allows the identification of the location of each capture probe; the random 
array has been decoded. The DBLs can then be removed, if necessary, and the target sample 
applied. 

In a preferred embodiment, the number of unique labels is less than the number of unique capture 
probes, and thus a sequential series of decoding steps is used. In this embodiment, decoder probes 
are divided into n sets for decoding. The number of sets corresponds to the number of unique tags. 
Each decoder probe is labeled in n separate reactions with n distinct tags. All the decoder probes 
share the same n tags. The decoder probes are pooled so that each pool contains only one of the n 
tag versions of each decoder, and no two decoder probes have the same sequence of tags across all 
the pools. The number of pools required for this to be true is determined by the number of decoder 
probes and by n. Hybridization of each pool to the array generates a signal at every address. The 
sequential hybridization of each pool in turn will generate a unique, sequence-specific code for each 
candidate probe. This identifies the candidate probe at each address in the array. For example, if four 
tags are used, then 4 x n sequential hybridizations can ideally distinguish 4^n sequences, although in 
some cases more steps may be required. After the hybridization of each pool, the hybrids are 
denatured and the decoder probes removed, so that the probes are rendered single-stranded for the 
next hybridization (although it is also possible to hybridize limiting amounts of target so that the 
available probe is not saturated. Sequential hybridizations can be carried out and analyzed by 
subtracting pre-existing signal from the previous hybridization). 

An example is illustrative. Assume an array of 16 probe nucleic acids (numbers 1-16), and four 
unique tags (four different fluors, for example; labels A-D). Decoder probes 1-16 are made that 
correspond to the probes on the beads. The first step is to label decoder probes 1-4 with tag A, 
decoder probes 5-8 with tag B, decoder probes 9-12 with tag C, and decoder probes 13-16 with tag D. 
The probes are mixed and the pool is contacted with the array containing the beads with the attached 
candidate probes. The location of each tag (and thus each decoder and candidate probe pair) is then 
determined. The first set of decoder probes is then removed. A second set is added, but this time, 
decoder probes 1, 5, 9 and 13 are labeled with tag A, decoder probes 2, 6, 10 and 14 are labeled with 
tag B, decoder probes 3, 7, 11 and 15 are labeled with tag C, and decoder probes 4, 8, 12 and 16 are 
labeled with tag D. Thus, those beads that contained tag A in both decoding steps contain candidate 
probe 1; tag A in the first decoding step and tag B in the second decoding step contain candidate 
probe 2; tag A in the first decoding step and tag C in the second step contain candidate probe 3; etc. 
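
The two-step example with 16 probes and four tags can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `build_probe_codes` and `decode_addresses`, and the representation of an address as a dictionary key, are assumptions made here and are not part of the described method.

```python
from itertools import product

TAGS = ["A", "B", "C", "D"]  # four distinguishable labels, e.g. four fluors


def build_probe_codes(num_probes, tags, num_steps):
    """Give each decoder probe a distinct sequence of tags, one tag per decoding step."""
    codes = list(product(tags, repeat=num_steps))
    if num_probes > len(codes):
        raise ValueError("not enough tag sequences for this many probes")
    # Probes are numbered from 1, as in the example in the text.
    return {probe: codes[probe - 1] for probe in range(1, num_probes + 1)}


def decode_addresses(observed, probe_codes):
    """Map each array address to a probe number from the tags seen at that address."""
    lookup = {code: probe for probe, code in probe_codes.items()}
    return {address: lookup.get(tuple(tags)) for address, tags in observed.items()}


probe_codes = build_probe_codes(16, TAGS, 2)
# Hypothetical readout: the tag seen at each address in step 1 and step 2.
readout = {"site_0": ["A", "A"], "site_1": ["A", "B"], "site_2": ["B", "A"]}
print(decode_addresses(readout, probe_codes))
```

With two steps and four tags the assignment reproduces the labeling described above: probes 1-4 carry tag A in the first step, and probes 1, 5, 9 and 13 carry tag A in the second step.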

In one embodiment, the decoder probes are labeled in situ; that is, they need not be labeled prior to 
the decoding reaction. In this embodiment, the incoming decoder probe is shorter than the candidate 
probe, creating a 5' "overhang" on the decoding probe. The addition of labeled ddNTPs (each labeled 
with a unique tag) and a polymerase will allow the addition of the tags in a sequence specific manner, 
thus creating a sequence-specific pattern of signals. Similarly, other modifications can be done, 
including ligation, etc. 

In addition, since the size of the array will be set by the number of unique decoding binding ligands, it 
is possible to "reuse" a set of unique DBLs to allow for a greater number of test sites. This may be 
done in several ways; for example, by using some subpopulations that comprise optical signatures. 
Similarly, a positional coding scheme can be used within an array, in which different sub-bundles reuse the 
set of DBLs. Similarly, one embodiment utilizes bead size as a coding modality, thus allowing the 
reuse of the set of unique DBLs for each bead size. Alternatively, sequential partial loading of arrays 
with beads can also allow the reuse of DBLs. Furthermore, "code sharing" can occur as well. 

In a preferred embodiment, the DBLs may be reused by having some subpopulations of beads 
comprise optical signatures. In a preferred embodiment, the optical signature is generally a mixture of 
reporter dyes, preferably fluorescent. By varying both the composition of the mixture (i.e. the ratio of 
one dye to another) and the concentration of the dye (leading to differences in signal intensity), 
matrices of unique optical signatures may be generated. This may be done by covalently attaching the 
dyes to the surface of the beads, or alternatively, by entrapping the dye within the bead. 

In a preferred embodiment, the encoding can be accomplished in a ratio of at least two dyes, although 
more encoding dimensions may be added in the size of the beads, for example. In addition, the labels 
are distinguishable from one another; thus two different labels may comprise different molecules (i.e. 
two different fluors) or, alternatively, one label at two different concentrations or intensity. 

In a preferred embodiment, the dyes are covalently attached to the surface of the beads. This may be 
done as is generally outlined for the attachment of the capture probes, using functional groups on the 
surface of the beads. As will be appreciated by those in the art, these attachments are done to 
minimize the effect on the dye. 

In a preferred embodiment, the dyes are non-covalently associated with the beads, generally by 
entrapping the dyes in the pores of the beads. 

Additionally, encoding in the ratios of the two or more dyes, rather than single dye concentrations, is 
preferred since it provides insensitivity to the intensity of the light used to interrogate the reporter dye's 
signature and to detector sensitivity. 

In a preferred embodiment, a spatial or positional coding system is used. In this embodiment, there 
are sub-bundles or subarrays (i.e. portions of the total array) that are utilized. By analogy with the 
telephone system, each subarray is an "area code" that can have the same tags (i.e. telephone 
numbers) as other subarrays, the subarrays being distinguished by their location. Thus, for 
example, the same unique tags can be reused from bundle to bundle. Thus, the use of 50 unique tags 
in combination with 100 different subarrays can form an array of 5000 different capture probes. In this 
embodiment, it becomes important to be able to identify one bundle from another; in general, this is 
done either manually or through the use of marker beads, i.e. beads containing unique tags for each 
subarray. 
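
The "area code" arithmetic can be made concrete with a small sketch. The zero-based numbering of subarrays and tags and the function name `capture_probe_id` are assumptions introduced here for illustration only.

```python
NUM_SUBARRAYS = 100  # "area codes"
NUM_TAGS = 50        # unique tags reused within every subarray


def capture_probe_id(subarray, tag, num_tags=NUM_TAGS):
    """Combine a subarray index and a tag index into one capture probe identifier."""
    if not 0 <= tag < num_tags:
        raise ValueError("tag index out of range")
    return subarray * num_tags + tag


# Every (subarray, tag) pair maps to a different capture probe.
all_ids = {capture_probe_id(s, t) for s in range(NUM_SUBARRAYS) for t in range(NUM_TAGS)}
print(len(all_ids))
```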

In alternative embodiments, additional encoding parameters can be added, such as microsphere size. 
For example, the use of different size beads may also allow the reuse of sets of DBLs; that is, it is 
possible to use microspheres of different sizes to expand the encoding dimensions of the 
microspheres. Optical fiber arrays can be fabricated containing pixels with different fiber diameters or 
cross-sections; alternatively, two or more fiber optic bundles, each with different cross-sections of the 
individual fibers, can be added together to form a larger bundle; or, fiber optic bundles with fibers of the 
same size cross-sections can be used, but just with different sized beads. With different diameters, 
the largest wells can be filled with the largest microspheres, followed by progressively 
smaller microspheres in the smaller wells until wells of all sizes are filled. In this manner, the same 
dye ratio could be used to encode microspheres of different sizes, thereby expanding the number of 
different oligonucleotide sequences or chemical functionalities present in the array. Although outlined 
for fiber optic substrates, this as well as the other methods outlined herein can be used with other 
substrates and with other attachment modalities as well. 



In a preferred embodiment, the coding and decoding is accomplished by sequential loading of the 
microspheres into the array. As outlined above for spatial coding, in this embodiment, the optical 
signatures can be "reused". In this embodiment, the library of microspheres each comprising a 
different capture probe (or the subpopulations each comprise a different capture probe), is divided into 
a plurality of sublibraries; for example, depending on the size of the desired array and the number of 
unique tags, 10 sublibraries each comprising roughly 10% of the total library may be made, with each 
sublibrary comprising roughly the same unique tags. Then, the first sublibrary is added to the fiber 
optic bundle comprising the wells, and the location of each capture probe is determined, generally 
through the use of DBLs. The second sublibrary is then added, and the location of each capture probe 
is again determined. The signal in this case will comprise the signal from the "first" DBL and the 
"second" DBL; by comparing the two matrices the location of each bead in each sublibrary can be 
determined. Similarly, adding the third, fourth, etc. sublibraries sequentially will allow the array to be 
filled. 
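
The comparison of successive readouts can be sketched as follows. The sketch assumes that beads loaded in earlier rounds stay in place and keep giving the same signal, so any address that first shows a signal in a given round must hold a bead from the sublibrary loaded in that round; the function name and the dictionary representation of a readout are hypothetical.

```python
def assign_sublibraries(readouts):
    """Assign each address the loading round in which it first showed a tag.

    `readouts` is a list with one dictionary per loading round, mapping each
    occupied address to the tag read there after that round.
    """
    assignments = {}
    for round_number, readout in enumerate(readouts, start=1):
        for address, tag in readout.items():
            if address not in assignments:
                # First appearance: the bead came from this round's sublibrary.
                assignments[address] = (round_number, tag)
    return assignments


# Hypothetical readouts after loading two sublibraries that reuse tags A and B.
after_first = {0: "A", 3: "B"}
after_second = {0: "A", 3: "B", 1: "A", 5: "B"}
print(assign_sublibraries([after_first, after_second]))
```

Because the loading round is recorded together with the tag, the same tag can identify a different capture probe in each sublibrary.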



In a preferred embodiment, codes can be "shared" in several ways. In a first embodiment, a single 
code (i.e. IBL/DBL pair) can be assigned to two or more agents if the target sequences differ 
sufficiently in their binding strengths. For example, two nucleic acid probes used in an mRNA 
quantitation assay can share the same code if the ranges of their hybridization signal intensities do not 
overlap. This can occur, for example, when one of the target sequences is always present at a much 
higher concentration than the other. Alternatively, the two target sequences might always be present 
at a similar concentration, but differ in hybridization efficiency. 

Alternatively, a single code can be assigned to multiple agents if the agents are functionally equivalent. 
For example, if a set of oligonucleotide probes are designed with the common purpose of detecting 
the presence of a particular gene, then the probes are functionally equivalent, even though they may 
differ in sequence. Similarly, an array of this type could be used to detect homologs of known genes. 
In this embodiment, each gene is represented by a heterologous set of probes, hybridizing to different 
regions of the gene (and therefore differing in sequence). The set of probes share a common code. If 
a homolog is present, it might hybridize to some but not all of the probes. The level of homology might 
be indicated by the fraction of probes hybridizing, as well as the average hybridization intensity. 

Similarly, multiple antibodies to the same protein could all share the same code. 

In a preferred embodiment, decoding of self-assembled random arrays is done on the basis of pH 
titration. In this embodiment, in addition to capture probes, the beads comprise optical signatures, 
wherein the optical signatures are generated by the use of pH-responsive dyes (sometimes referred to 
herein as "pH dyes") such as fluorophores. This embodiment is similar to that outlined in PCT 
US98/05025 and U.S.S.N. 09/151,877, both of which are expressly incorporated by reference, except 
that the dyes used in the present invention exhibit changes in fluorescence intensity (or other 
properties) when the solution pH is adjusted from below the pKa to above the pKa (or vice versa). In a 
preferred embodiment, a set of pH dyes is used, each with a different pKa, preferably separated by 
at least 0.5 pH units. Preferred embodiments utilize a pH dye set of pKa's of 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 
5.0, 5.5, 6.0, 6.5, 7.0, 7.5, 8.0, 8.5, 9.0, 9.5, 10.0, 10.5, 11, and 11.5. Each bead can contain any 
subset of the pH dyes, and in this way a unique code for the capture probe is generated. Thus, the 
decoding of an array is achieved by titrating the array from pH 1 to pH 13, and measuring the 
fluorescence signal from each bead as a function of solution pH. 

Thus, the present invention provides array compositions comprising a substrate with a surface 
comprising discrete sites. A population of microspheres is distributed on the sites, and the population 
comprises at least a first and a second subpopulation. Each subpopulation comprises a capture 
probe, and, in addition, at least one optical dye with a given pKa. The pKas of the different optical 
dyes are different. 

In a preferred embodiment, "random" decoding probes can be made. By sequential hybridizations or 
the use of multiple labels, as is outlined above, a unique hybridization pattern can be generated for 
each sensor element. This allows all the beads representing a given clone to be identified as 
belonging to the same group. In general, this is done by using random or partially degenerate 
decoding probes that bind in a sequence-dependent but not highly sequence-specific manner. The 
process can be repeated a number of times, each time using a different labeling entity, to generate a 
different pattern of signals based on quasi-specific interactions. In this way, a unique optical signature 
is eventually built up for each sensor element. By applying pattern recognition or clustering algorithms 
to the optical signatures, the beads can be grouped into sets that share the same signature (i.e. carry 
the same probes). 

In order to identify the actual sequence of the clone itself, additional procedures are required; for 
example, direct sequencing can be done, or an ordered array containing the clones, such as a spotted 
cDNA array, can be used to generate a "key" that links a hybridization pattern to a specific clone. 

Alternatively, clone arrays can be decoded using binary decoding with vector tags. For example, 
partially randomized oligos are cloned into a nucleic acid vector (e.g. plasmid, phage, etc.). Each 
oligonucleotide sequence consists of a subset of a limited set of sequences. For example, if the 
limited set comprises 10 sequences, each oligonucleotide may have some subset (or all) of the 10 
sequences. Thus each of the 10 sequences can be present or absent in the oligonucleotide. 
Therefore, there are 2^10 or 1,024 possible combinations. The sequences may overlap, and minor 
variants can also be represented (e.g. A, C, T and G substitutions) to increase the number of possible 
combinations. A nucleic acid library is cloned into a vector containing the random code sequences. 
Alternatively, other methods such as PCR can be used to add the tags. In this way it is possible to use 
a small number of oligo decoding probes to decode an array of clones. 
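
The presence or absence of each of the 10 sequences behaves like a 10-bit binary number, which can be sketched as below. The numbering of the code sequences from 0 to 9 and the function names are assumptions made for illustration; the sketch treats each decoding probe as reporting simply whether its sequence is present in the tag.

```python
NUM_CODE_SEQUENCES = 10  # size of the limited set of sequences


def tag_to_code(present):
    """Convert the set of code sequences present in a vector tag to an integer code."""
    return sum(1 << index for index in present)


def code_to_tag(code, num_sequences=NUM_CODE_SEQUENCES):
    """Recover which code sequences are present from an integer code."""
    return {index for index in range(num_sequences) if (code >> index) & 1}


# Ten decoding probes, one per code sequence, can tell apart 2**10 tags.
print(2 ** NUM_CODE_SEQUENCES)
# A clone whose tag contains code sequences 0, 3 and 9:
code = tag_to_code({0, 3, 9})
print(code, code_to_tag(code))
```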

As will be appreciated by those in the art, the systems of the invention may take on a large number of 
different configurations, as is generally depicted in the Figures. In general, there are three types of 
systems that can be used: (1) "non-sandwich" systems (also referred to herein as "direct" detection) in 
which the target sequence itself is labeled with detectable labels (again, either because the primers 
comprise labels or due to the incorporation of labels into the newly synthesized strand); (2) systems in 
which label probes directly bind to the target analytes; and (3) systems in which label probes are 
indirectly bound to the target sequences, for example through the use of amplifier probes. 

Detection of the reactions of the invention, including the direct detection of products and indirect 
detection utilizing label probes (i.e. sandwich assays), is preferably done by detecting assay 
complexes comprising detectable labels, which can be attached to the assay complex in a variety of 
ways. 

In a preferred embodiment, an array of different and usually artificial capture probes are made; that is, 
the capture probes do not have complementarity to known target sequences. The adapter sequences 
can then be added to any target sequences, or soluble capture extender probes are made; this allows 
the manufacture of only one kind of array, with the user able to customize the array through the use of 
adapter sequences or capture extender probes. This then allows the generation of customized soluble 
probes, which as will be appreciated by those in the art is generally simpler and less costly. 

When capture extender probes are used, in one embodiment, microsphere arrays containing a single 
type of capture probe are made; in this embodiment, the capture extender probes are added to the 
beads prior to loading on the array. The capture extender probes may be additionally fixed or 
crosslinked, as necessary. 

Accordingly, the present invention provides compositions and methods for detecting the presence or 
absence of target analytes, including nucleic acid sequences, in a sample. As will be appreciated by 
those in the art, the sample solution may comprise any number of things, including, but not limited to, 
bodily fluids (including, but not limited to, blood, urine, serum, lymph, saliva, anal and vaginal 
secretions, perspiration and semen, of virtually any organism, with mammalian samples being 
preferred and human samples being particularly preferred); environmental samples (including, but not 
limited to, air, agricultural, water and soil samples); biological warfare agent samples; research 
samples (i.e. in the case of nucleic acids, the sample may be the products of an amplification reaction, 
including both target and signal amplification); purified samples, such as purified genomic DNA, RNA, 
proteins, etc.; raw samples (bacteria, virus, genomic DNA, etc.). As will be appreciated by those in the 
art, virtually any experimental manipulation may have been done on the sample. 



The present invention provides compositions and methods for detecting the presence or absence of 
target nucleic acid sequences in a sample. 

In a preferred embodiment, several levels of redundancy are built into the arrays of the invention. 
Building redundancy into an array gives several significant advantages, including the ability to make 
quantitative estimates of confidence about the data and significant increases in sensitivity. Thus, 
preferred embodiments utilize array redundancy. As will be appreciated by those in the art, there are 
at least two types of redundancy that can be built into an array: the use of multiple identical sensor 
elements (termed herein "sensor redundancy"), and the use of multiple sensor elements directed to 
the same target analyte, but comprising different chemical functionalities (termed herein "target 
redundancy"). For example, for the detection of nucleic acids, sensor redundancy utilizes a plurality 
of sensor elements such as beads comprising identical binding ligands such as probes. Target 
redundancy utilizes sensor elements with different probes to the same target: one probe may span the 
first 25 bases of the target, a second probe may span the second 25 bases of the target, etc. By 
building in either or both of these types of redundancy into an array, significant benefits are obtained. 
For example, a variety of statistical mathematical analyses may be done. 

In addition, while this is generally described herein for bead arrays, as will be appreciated by those in 
the art, these techniques can be used for any type of array designed to detect target analytes. 
Furthermore, while these techniques are generally described for nucleic acid systems, these 
techniques are useful in the detection of other binding ligand/target analyte systems as well. 

In a preferred embodiment, sensor redundancy is used. In this embodiment, a plurality of sensor 
elements, e.g. beads, comprising identical bioactive agents are used. That is, each subpopulation 
comprises a plurality of beads comprising identical bioactive agents (e.g. binding ligands). By using a 
number of identical sensor elements for a given array, the optical signal from each sensor element can 
be combined and any number of statistical analyses run, as outlined below. This can be done for a 
variety of reasons. For example, in time varying measurements, redundancy can significantly reduce 
the noise in the system. For non-time based measurements, redundancy can significantly increase 
the confidence of the data. 

In a preferred embodiment, a plurality of identical sensor elements are used. As will be appreciated by 
those in the art, the number of identical sensor elements will vary with the application and use of the 
sensor array. In general, anywhere from 2 to thousands may be used, with from 2 to 100 being 
preferred, 2 to 50 being particularly preferred and from 5 to 20 being especially preferred. In general, 
preliminary results indicate that roughly 10 beads give a sufficient advantage, although for some 
applications, more identical sensor elements can be used. 

Once obtained, the optical response signals from a plurality of sensor beads within each bead 
subpopulation can be manipulated and analyzed in a wide variety of ways, including baseline 
adjustment, averaging, standard deviation analysis, distribution and cluster analysis, confidence 
interval analysis, mean testing, etc. 

In a preferred embodiment, the first manipulation of the optical response signals is an optional 
baseline adjustment. In a typical procedure, the standardized optical responses are adjusted to start 
at a value of 0.0 by subtracting the integer 1.0 from all data points. Doing this allows the baseline-loop 
data to remain at zero even when summed together and the random response signal noise is 
canceled out. When the sample is a fluid, the fluid pulse-loop temporal region, however, frequently 
exhibits a characteristic change in response, either positive, negative or neutral, prior to the sample 
pulse and often requires a baseline adjustment to overcome noise associated with drift in the first few 
data points due to charge buildup in the CCD camera. If no drift is present, typically the baseline from 
the first data point for each bead sensor is subtracted from all the response data for the same bead. If 
drift is observed, the average baseline from the first ten data points for each bead sensor is 
subtracted from all the response data for the same bead. By applying this baseline adjustment, 
when multiple bead responses are added together they can be amplified while the baseline remains at 
zero. Since all beads respond at the same time to the sample (e.g. the sample pulse), they all see the 
pulse at the exact same time and there is no registering or adjusting needed for overlaying their 
responses. In addition, other types of baseline adjustment may be done, depending on the 
requirements and output of the system used. 
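
The two baseline adjustments described above (first data point when no drift is present, average of the first ten data points when drift is observed) can be sketched as follows. This is a minimal illustration only; the function name `baseline_adjust` and its parameters are hypothetical and do not come from the described system.

```python
from statistics import mean


def baseline_adjust(response, drift=False, n_baseline=10):
    """Subtract a per-bead baseline from one bead's response trace.

    response   : list of intensity values over time for a single bead
    drift      : if False, the first data point is the baseline;
                 if True, the mean of the first n_baseline points is used
    n_baseline : number of leading points averaged when drift is True
                 (ten in the procedure described in the text)
    """
    if not response:
        return []
    if drift:
        baseline = mean(response[:n_baseline])
    else:
        baseline = response[0]
    # The same baseline is removed from every point of the same bead.
    return [value - baseline for value in response]


# Illustrative data only (not measured values).
print(baseline_adjust([1.0, 1.0, 1.5, 2.0]))  # [0.0, 0.0, 0.5, 1.0]
```

Because each bead is adjusted against its own baseline, traces from different beads start at zero and can be overlaid or added without further registration.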

Once the baseline has been adjusted, a number of possible statistical analyses may be run to 
generate known statistical parameters. Analyses based on redundancy are known and generally 
described in texts such as Freund and Walpole, Mathematical Statistics, Prentice Hall, Inc. New 
Jersey, 1980, hereby incorporated by reference in its entirety. 
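
As an illustration of the kind of redundancy-based parameters mentioned above, the following sketch computes the mean, sample standard deviation, standard error and an approximate confidence interval for the responses of identical beads at one time point. The function name `summarize_beads` is hypothetical, and the interval uses a normal approximation, which is an assumption made here for brevity; for small bead counts a t-based interval would be wider.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def summarize_beads(values, confidence=0.95):
    """Summary statistics for redundant bead responses at one time point.

    values     : responses of identical beads (one value per bead)
    confidence : coverage of the approximate confidence interval
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two bead values")
    m = mean(values)
    s = stdev(values)          # sample standard deviation
    sem = s / sqrt(n)          # standard error of the mean
    # Normal approximation (assumption); not a Student's t interval.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return {
        "n": n,
        "mean": m,
        "stdev": s,
        "sem": sem,
        "ci": (m - z * sem, m + z * sem),
    }


# Illustrative data only (not measured values).
stats = summarize_beads([1.0, 2.0, 3.0, 4.0, 5.0])
print(stats["mean"], stats["ci"])
```

The standard error shrinks with the square root of the number of beads, which is one way to see why additional identical sensor elements increase confidence in the data.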

In a preferred embodiment, signal summing is done by simply adding the intensity values of all 
responses at each time point, generating a new temporal response comprised of the sum of all bead 
responses. These values can be baseline-adjusted or raw. As for all the analyses described herein, 
signal summing can be performed in real time or during post-data acquisition data reduction and 
analysis. In one embodiment, signal summing is performed with a commercial spreadsheet program 
(Excel, Microsoft, Redmond, WA) after optical response data is collected. 
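
Signal summing as described above amounts to adding the intensity values of all beads at each time point. A minimal sketch follows; the function name `sum_signals` is hypothetical, and the traces are assumed to be sampled at the same time points, consistent with all beads seeing the sample pulse simultaneously.

```python
def sum_signals(bead_responses):
    """Sum the responses of several beads at each time point.

    bead_responses : list of per-bead traces (raw or baseline-adjusted),
                     each a list of intensities of equal length
    Returns a new temporal response that is the point-wise sum.
    """
    if not bead_responses:
        return []
    length = len(bead_responses[0])
    if any(len(trace) != length for trace in bead_responses):
        raise ValueError("all bead responses must have the same number of time points")
    # zip(*...) groups the values of all beads that share a time point.
    return [sum(values) for values in zip(*bead_responses)]


# Illustrative data only (not measured values).
print(sum_signals([[0.0, 1.0, 2.0], [0.0, 0.5, 1.5]]))  # [0.0, 1.5, 3.5]
```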

Methods for signal summing and analyses are included in U.S.S.N. 08/944,850, filed October 6, 1997; 
09/287,573, filed April 6, 1999; and 60/238,866, filed October 6, 2000; and PCT Nos. US98/21193, filed 
October 6, 1998; and US00/09183, filed April 6, 2000. 

Once made, the methods and compositions of the invention find use in a number of applications. In a 
preferred embodiment, the compositions are used to probe a sample solution for the presence or 
absence of a target sequence, including the quantification of the amount of target sequence present. 
The compositions and methods find utility in genotyping assays and sequencing 
assays, and in a wide range of target analyte assays, including immunoassays. 

For SNP analysis, the ratio of different labels at a particular location on the array indicates the 
homozygosity or heterozygosity of the target sample, assuming the same concentration of each 
readout probe is used. Thus, for example, assuming a first readout probe comprising a first base at 
the readout position with a first detectable label and a second readout probe comprising a second 
base at the readout position with a second detectable label, roughly equal signals (a ratio of approximately 1:1, taking into 
account the different signal intensities of the different labels, different hybridization efficiencies, and 
other factors) from the first and second labels indicate a heterozygote. The absence of a signal from 
the first label (or a ratio of approximately 0:1) indicates a homozygote of the second detection base; 
the absence of a signal from the second label (or a ratio of approximately 1:0) indicates a homozygote 
for the first detection base. As is appreciated by those in the art, the actual ratios for any particular 
system are generally determined empirically. 
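
The ratio-based interpretation above can be sketched as a simple classification. Because the actual ratios are determined empirically for each system, the cutoffs `low` and `high` below are placeholder assumptions, and the function name `call_genotype` is hypothetical; the signals are assumed to have already been normalized for label intensity and hybridization efficiency.

```python
def call_genotype(signal_1, signal_2, low=0.2, high=0.8):
    """Classify a locus from the signals of two readout-probe labels.

    signal_1, signal_2 : normalized intensities of the first and second labels
    low, high          : placeholder cutoffs on the first label's share of the
                         total signal; real cutoffs are set empirically
    """
    total = signal_1 + signal_2
    if total <= 0:
        return "no call"
    fraction_1 = signal_1 / total
    if fraction_1 >= high:
        # Ratio near 1:0, second label essentially absent.
        return "homozygous (first base)"
    if fraction_1 <= low:
        # Ratio near 0:1, first label essentially absent.
        return "homozygous (second base)"
    # Ratio near 1:1.
    return "heterozygous"


# Illustrative data only (not measured values).
print(call_genotype(100, 95))   # heterozygous
print(call_genotype(100, 2))    # homozygous (first base)
print(call_genotype(3, 120))    # homozygous (second base)
```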

Generally, a sample containing a target analyte (whether for detection of the target analyte or 
screening for binding partners of the target analyte) is added to the array, under conditions suitable for 
binding of the target analyte to at least one of the capture probes, i.e. generally physiological 
conditions. The presence or absence of the target analyte is then detected. As will be appreciated by 
those in the art, this may be done in a variety of ways, generally through the use of a change in an 
optical signal. This change can occur via many different mechanisms. A few examples include the 
binding of a dye-tagged analyte to the bead, the production of a dye species on or near the beads, the 
destruction of an existing dye species, a change in the optical signature upon analyte interaction with 
dye on the bead, or any other optically interrogatable event. 

In a preferred embodiment, the change in optical signal occurs as a result of the binding of a target 
analyte that is labeled, either directly or indirectly, with a detectable label, preferably an optical label 
such as a fluorochrome. Thus, for example, when a proteinaceous target analyte is used, it may be 
either directly labeled with a fluor, or indirectly, for example through the use of a labeled antibody. 
Similarly, nucleic acids are easily labeled with fluorochromes, for example during PCR amplification 
as is known in the art. Alternatively, upon binding of the target sequences, a hybridization indicator 
may be used as the label. Hybridization indicators preferentially associate with double stranded 
nucleic acid, usually reversibly. Hybridization indicators include intercalators and minor and/or major 
groove binding moieties. In a preferred embodiment, intercalators may be used; since intercalation 
generally only occurs in the presence of double stranded nucleic acid, only in the presence of target 
hybridization will the label light up. Thus, upon binding of the target analyte to a capture probe, there is 
a new optical signal generated at that site, which then may be detected. 

Alternatively, in some cases, as discussed above, the target analyte such as an enzyme generates a 
species that is either directly or indirectly optically detectable. 

Furthermore, in some embodiments, a change in the optical signature may be the basis of the optical 
signal. For example, the interaction of some chemical target analytes with some fluorescent dyes on 
the beads may alter the optical signature, thus generating a different optical signal. 

As will be appreciated by those in the art, in some embodiments, the presence or absence of the 
target analyte may be detected using changes in other optical or non-optical signals, including, but not 
limited to, surface enhanced Raman spectroscopy, surface plasmon resonance, radioactivity, etc. 

The assays may be run under a variety of experimental conditions, as will be appreciated by those in 
the art. A variety of other reagents may be included in the screening assays. These include reagents 
such as salts, neutral proteins (e.g. albumin), detergents, etc., which may be used to facilitate optimal 
protein-protein binding and/or reduce non-specific or background interactions. Also reagents that 
otherwise improve the efficiency of the assay, such as protease inhibitors, nuclease inhibitors, 
anti-microbial agents, etc., may be used. The mixture of components may be added in any order that 
provides for the requisite binding. Various blocking and washing steps may be utilized as is known in 
the art. 

The following examples serve to more fully describe the manner of using the above-described 
invention, as well as to set forth the best modes contemplated for carrying out various aspects of the 
invention. It is understood that these examples in no way serve to limit the true scope of this invention, 
but rather are presented for illustrative purposes. All references cited herein are incorporated by 
reference in their entirety. 

Examples 

Example 1 

Immobilization of Crude Oligonucleotides to a Solid Support 

1. Introduce a chemical functional group (such as -NH2, -COOH, -NCO, -NHS, -SH, -CHO, etc.) onto a 
solid support. 

2. Activate the functional group before oligonucleotide attachment. 

3. 5'-terminal modified oligonucleotide attachment. 

Crude oligonucleotides were attached to supports and compared to results from attachment of 
purified oligonucleotides. As demonstrated in Figure 3, in the presence of 2 M salt, crude 
oligonucleotides were immobilized as efficiently as purified oligonucleotides. 

In addition, the improved attachment of oligonucleotides to a solid support in the presence of 
increased salt was sequence and length independent. Thus, the method finds use in attachment of all 
oligonucleotides to a solid support (see Figure 4). 

In addition, when 0.5 M to 3 M NaCl was used for attachment of oligonucleotides, non-purified 
oligonucleotides were attached with efficiency comparable to that of purified oligonucleotides 
(see Figure 5). 